How epidemic psychology works on Twitter: evolution of responses to the COVID-19 pandemic in the U.S.

L. M. Aiello, D. Quercia, et al.

This research by Luca Maria Aiello, Daniele Quercia, Ke Zhou, Marios Constantinides, Sanja Šćepanović, and Sagar Joglekar delves into the psychology of public response during the COVID-19 pandemic, uncovering phases of refusal, anger, and acceptance as fear and moralization spread. It highlights how these emotional responses shape societal behavior amidst health crises.

Introduction
The study examines whether language used on Twitter during the COVID-19 pandemic reflects Philip Strong’s “epidemic psychology,” which posits three concurrent psycho-social epidemics: fear, moralization, and action. Epidemics disrupt daily life and create uncertainty and anxiety, and Strong argued that the resulting psycho-social processes are transmitted and shaped through language. The authors focus on the U.S. during 2020 to test, at scale, whether and how these dimensions emerge on social media. Despite Twitter’s limitations (representativeness, self-presentation biases, and noise), the work aims to provide large-scale, real-time insights into public psycho-social responses by identifying temporal phases in public discourse and their association with key pandemic events.
Literature Review
Prior work has analyzed social media during outbreaks such as Zika, Ebola, and H1N1, focusing on content, behavior, diffusion of information and misinformation, and search queries indicating information-seeking behavior. Psychological responses to COVID-19 have largely been studied via surveys. However, a large-scale empirical test of Strong’s model through language on social media had not been conducted. The paper also discusses known issues with Twitter data (limited user base, demographic skews, bots) but highlights its value for timely, fine-grained population-level signals complementing traditional data sources.
Methodology
Dataset: From the public COVID-19 Twitter dataset (Chen et al., 2020), the authors collected 554,941,519 tweets posted from Feb 1 to Dec 31, 2020. They localized users to the U.S. by parsing free-text profile locations with custom geospatial expressions matching the U.S., its cities, and its states. The result was 6,271,835 unique U.S.-based users posting 12,320,155 English tweets. Extremely high-activity automated accounts (up to 15,823 tweets) were filtered out, and analyses were normalized to user-level fractions to mitigate volume fluctuations and bot impact.
Operationalizing Strong’s model: Three authors coded Strong (1990) line by line to extract keywords for fear, morality, and action. The independently produced keyword lists were intersected to form conservative sets. Keywords were then mapped to language categories using established lexicons: LIWC (emotions, social processes, function words), EmoLex (Plutchik emotions; moral foundations), and a pro-social behavior lexicon. Similar keywords were grouped and mapped to one or more categories, informed by prior studies.
Temporal signals: A tweet counts as containing a category if any of its words or stems match that category. For a category c and day t, the authors computed the fraction of active users mentioning c: f_c(t) = U_c(t)/U(t), where U_c(t) is the number of users who mentioned c on day t and U(t) is the number of users active on day t. Fractions were standardized to z-scores over the study period so that categories could be compared. To reduce daily noise, a 7-day moving average was applied to all time series.
Change-point detection: For each category, the authors computed the daily squared gradient of the smoothed series, then averaged across categories to obtain G(t). Peaks in G(t) (found via scipy.signal.find_peaks) that exceeded the mean of G(t) by a set number of standard deviations were taken as change-points marking collective shifts in language use.
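The temporal-signal and change-point steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code. The gradient is approximated by a day-to-day difference, and the moving average is trailing. The threshold multiplier `n_std=2.0` and the `min_gap` spacing rule are hypothetical choices, since the summary does not state them. A plain local-maximum scan stands in for `scipy.signal.find_peaks` to keep the example dependency-free, and the toy user counts are invented for demonstration.

```python
from statistics import mean, pstdev

def user_fraction(users_with_category, active_users):
    """f_c(t) = U_c(t) / U(t) for every day t."""
    return [uc / u for uc, u in zip(users_with_category, active_users)]

def zscore(series):
    """Standardize over the whole period (assumes the series is not constant)."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]

def moving_average(series, window=7):
    """Trailing moving average; the first days use the shorter window available."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def mean_squared_gradient(smoothed_by_category):
    """G(t): squared day-to-day change, averaged across categories.
    Entry i describes the change between day i and day i + 1."""
    series = list(smoothed_by_category.values())
    n_days = len(series[0])
    return [mean((s[t + 1] - s[t]) ** 2 for s in series) for t in range(n_days - 1)]

def change_points(g, n_std=2.0, min_gap=7):
    """Local maxima of G above mean + n_std * std (n_std and min_gap are assumed values)."""
    threshold = mean(g) + n_std * pstdev(g)
    peaks = []
    for t in range(1, len(g) - 1):
        is_local_max = g[t] > g[t - 1] and g[t] >= g[t + 1]
        far_enough = not peaks or t - peaks[-1] >= min_gap
        if is_local_max and g[t] > threshold and far_enough:
            peaks.append(t)
    return peaks

# Toy data: 60 days, 1000 active users per day, language shifts abruptly on day 30.
active = [1000] * 60
counts = {
    "anger": [100] * 30 + [300] * 30,  # users mentioning the category each day
    "sad": [200] * 30 + [50] * 30,
}
smoothed = {
    c: moving_average(zscore(user_fraction(u, active))) for c, u in counts.items()
}
G = mean_squared_gradient(smoothed)
cps = change_points(G)
print(cps)  # indices into G where collective language use shifts
```

Working with user fractions rather than raw tweet counts mirrors the normalization described above: a single prolific account contributes at most one user to U_c(t) on any given day.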
External validation and behavioral markers: The authors compared language trends with alternative NLP signals and mobility data:
- Interaction types: An LSTM-based classifier (Choi et al., 2020) estimated likelihoods for types of social interaction (e.g., emotional support, social proximity, power). Scores were thresholded at the 85th percentile per interaction type, binarized, converted to user fractions, and min–max normalized.
- Mentions of medical entities: A Bi-LSTM sequence tagger with GloVe and BERT embeddings, trained on Micromed (F1=0.72), extracted symptom entities; user fractions per entity were min–max normalized.
- Mobility tracks: Foursquare visit data from 1.5M “always-on” U.S. users across 35 venue categories; daily visit counts were normalized per state and averaged to produce national indicators.
All signals were smoothed with a 7-day moving average.
Composite real-time measures: Z-scores require the full study period for normalization, so they cannot be computed as events unfold. To enable real-time phase detection, the authors trained logistic regression classifiers to label days as belonging to each phase using category z-scores. They identified the top positive and negative predictors of each phase (refusal: death positive; anger: swear positive vs death negative; acceptance: sad positive vs anxiety negative) and defined parsimonious composite measures as the percent change of the most positively associated category minus the percent change of the most negatively associated category (e.g., Anger = Δ%swear − Δ%death). These composites reproduced the same change-points as the full measures and were correlated with the behavioral markers.
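As a worked illustration of the composite measure Anger = Δ%swear − Δ%death, the sketch below computes a difference of percent changes from two series of daily user fractions. It assumes the percent change is taken day over day, which the summary does not specify (the authors may use a different baseline). The function names and the numbers are hypothetical.

```python
def percent_change(series):
    """Day-over-day percent change (assumed baseline); requires non-zero previous values."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(series, series[1:])]

def composite(positive, negative):
    """Percent change of the positive predictor minus that of the negative predictor."""
    return [p - n for p, n in zip(percent_change(positive), percent_change(negative))]

# Invented daily user fractions for two language categories.
swear = [0.10, 0.11, 0.132]   # rises by 10%, then by 20%
death = [0.20, 0.19, 0.19]    # falls by 5%, then stays flat

anger = composite(swear, death)
print([round(a, 2) for a in anger])
```

Because the measure relies only on recent values of two categories, it can be updated each day without renormalizing the whole period.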
Key Findings
- Three phases identified during the first wave (Feb 1–Apr 15), with change-points on Feb 27 (first infection announced in the U.S.) and Mar 24 (stay-at-home orders):
  • Refusal phase: Characterized by anxiety and fear; frequent mentions of death (peak +45% on Feb 11 vs period average) and focus on others (“they”). Other categories showed little variation, indicating business-as-usual.
  • Anger phase (from Feb 27): Surge in negative emotions, predominantly anger; shift from abstract fear to concrete health concerns (risk, body). Pronoun use suggested polarization; home-related words peaked on Mar 16 (+38%) as social distancing guidelines were announced.
  • Acceptance phase (from Mar 24): Increased references to power/authority (policy enforcement); conflict diminished and sadness predominated; increases in care for others (+19%) and pro-social behavior (+25%); work-related mentions rose (job loss, work-from-home).
- Cyclical dynamics across 2020: After the first wave, two additional waves showed language change-points (May 27/Jun 9; Oct 2). Anger re-emerged with contagion surges, notably around 100,000 U.S. deaths and President Trump’s COVID-19 diagnosis. Refusal steadily declined; acceptance increased over the year. Within each wave, a repeated sequence of local maxima appeared: refusal → anger → acceptance.
- Composite real-time measures replicated the change-points and associated the phases with external markers: refusal linked with conflictual interactions and long-range mobility; anger linked with reduced anxiety and peaks in physical health concerns; acceptance linked with support, gratitude, and increased mentions of mental health.
- Scale and activity: Average 437k active users/day (min ~72k, max ~1.84M on Mar 18), enabling robust temporal estimation while mitigating bots via user-level metrics and filtering.
Discussion
The findings confirm Strong’s epidemic psychology on a national-scale Twitter corpus: language reflecting fear, moralization, and action co-occurred and evolved with pandemic milestones, forming three temporal phases. Beyond Strong’s descriptive framework, the study uncovers temporal structure—refusal, anger, acceptance—that cycles with contagion waves, indicating that psycho-social responses are recurrent and event-driven. The validation with interaction types, medical symptom mentions, and mobility data strengthens the behavioral relevance of the linguistic signals. The composite real-time measures demonstrate practical utility for embedding psycho-social indicators in epidemiological and mobility models, offering timely insights into public risk perception, conflict, support, and compliance dynamics.
Conclusion
The paper provides the first large-scale operationalization and empirical test of Strong’s epidemic psychology using U.S. COVID-19 Twitter data from 2020. It identifies three recurring phases—refusal, anger, acceptance—aligned with pandemic events, documents their cyclical nature across multiple waves, and links them to behavioral markers. The authors introduce parsimonious, real-time composite measures suitable for integration into epidemiological and urban mobility models. Future work should: compare across epidemics, analyze finer geographic and cultural contexts, relate phases to grief-stage theories, and develop more granular language/event detection to disentangle overlapping psycho-social processes.
Limitations
- Focus on a single epidemic (COVID-19) without cross-epidemic comparison.
- Coarse geographic scope (entire U.S.); limited exploration of state-level or cultural differences.
- Psycho-social epidemics can overlap, complicating orthogonal interpretation; finer-grained categories or alternative event detection could help.
- Twitter-centric analysis with representativeness issues (demographic skew, ~20% adult usage, self-presentation biases) and bot activity; mitigations included user-level metrics and filtering, but residual effects may remain.
- Inherent subjectivity in inferring psychological states from language; early-stage surges and rapid shifts are difficult to validate with traditional data.