Measuring the impact of COVID-19 on heritage sites in the UK using social media data

Z. Liu, S. A. Orr, et al.

This study examines the effects of COVID-19 on UK heritage sites through a detailed analysis of 1.4 million Google Maps visitor reviews. Conducted by Ziwen Liu, Scott Allan Orr, Pakhee Kumar, and Josep Grau-Bove, it reveals changes in visitor sentiment and behavior, highlighting the challenges faced by urban indoor sites as the pandemic evolves.

Introduction
The study investigates how COVID-19 and associated non-pharmaceutical interventions affected visitor numbers and experiences at UK heritage sites. The pandemic reshaped tourism by altering risk perceptions and behaviors of tourists and stakeholders, with closures, social distancing, face coverings, and hygiene measures influencing site operations and visitor experience. Impacts varied geographically and by site characteristics (urban vs rural; indoor vs outdoor), with outdoor and rural destinations generally more resilient than indoor and urban sites. Given the economic importance of heritage tourism to the UK and the availability of extensive social media data, the authors aim to: (1) quantify the pandemic’s impact on visitor involvement using online reviews as a proxy for visitation and (2) measure visitor sentiment toward COVID-19 preventive measures. The study showcases the utility of advanced NLP and computer vision methods to extract actionable insights from large-scale, noisy social media data for heritage management and recovery planning.
Literature Review
Prior work shows tourism was among the hardest hit sectors by COVID-19 globally, with impacts varying by country development level and site type. Studies noted urban tourism suffered more than rural, and outdoor venues were perceived as safer than indoor ones. Within the UK, heritage organizations faced severe viability risks during lockdowns, but the crisis also prompted digital engagement innovations. Social media analytics have been leveraged to study pandemic-related perceptions: Instagram hashtag analyses indicated positive social values during closures; Twitter-based studies using VADER and topic modeling found mixed but often polarized sentiments around face coverings, shifting emotions related to vaccines, and strong salience of stay-at-home topics. Methodologically, prior research commonly applied lexicon-based sentiment and LDA topic modeling. This paper advances the literature by applying weakly supervised and zero-shot/deep learning NLP to large-scale Google Maps reviews to decode visitor perceptions of specific COVID-19 measures at heritage sites and track changes over time.
Methodology
Data and inclusion: The authors compiled Google Maps reviews from February 2006 to April 2022 for 775 UK heritage-related sites. Sites were selected by the following criteria: more than 100 reviews; primarily cultural heritage/tourism; and major leisure comparators (e.g., national parks). Sources included portfolios of English Heritage, National Trust (England, Wales, NI), National Trust for Scotland, Historic Environment Scotland, VisitBritain’s most-visited attractions lists, ALVA’s most-visited museums (2019), Historic England’s “A History of England in 100 Places,” and other landmark sites. After excluding non-English and rating-only entries, ~1.4 million textual reviews remained. For each site, up to 100 user photos were sampled (one per unique uploader).
Classifying outdoorness and urbanness: For outdoorness, the Places365 CNN model estimated the proportion of outdoor photos per site. To counter bias toward outdoor imagery (e.g., photo restrictions indoors, travel-route photos), the threshold for classifying a site as outdoor used the lower bound of the 1% confidence interval from known outdoor sites, yielding a threshold of 0.83; sites below were classified as indoor. For urbanness, the density of Ordnance Survey Code-Point postcodes within a 10 km radius of each site was used as a proxy for population density; a threshold of 24 postcodes/km² (calibrated with known urban sites like central London) classified sites as urban vs rural.
Detecting COVID-19-related comments: Two approaches were combined: (1) keyword matching (e.g., covid, coronavirus, social distance/ing, pandemic, delta, omicron) and (2) a natural language inference (NLI) approach using BERTweet-large fine-tuned on MNLI. Each review was paired with the hypothesis “This sentence has user’s review about COVID-19” and assigned an entailment probability; a conservative threshold of 0.92 was selected to minimize false positives pre-2020.
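The two site-classification thresholds described above can be illustrated with a minimal sketch. The threshold values and the 10 km radius come from the text; the function name, its inputs, and the treatment of values exactly at the urban threshold are assumptions made for illustration, not the authors' implementation.

```python
import math

OUTDOOR_THRESHOLD = 0.83  # share of a site's photos labelled outdoor (from the text)
URBAN_THRESHOLD = 24.0    # postcodes per km^2 (from the text)
RADIUS_KM = 10.0          # radius around each site (from the text)


def classify_site(outdoor_photo_share: float, postcodes_in_radius: int) -> tuple[str, str]:
    """Return (setting, environment) for one site.

    Both arguments are hypothetical inputs: the share of sampled photos that
    an image classifier labelled outdoor, and the number of postcodes found
    within RADIUS_KM of the site.
    """
    # Postcode density over the circular search area, in postcodes per km^2.
    area_km2 = math.pi * RADIUS_KM ** 2
    density = postcodes_in_radius / area_km2

    # Sites below the outdoor threshold are treated as indoor, as in the text.
    environment = "outdoor" if outdoor_photo_share >= OUTDOOR_THRESHOLD else "indoor"
    # Assumption: a density exactly at the threshold counts as urban.
    setting = "urban" if density >= URBAN_THRESHOLD else "rural"
    return setting, environment
```

For scale, a 10 km radius covers roughly 314 km², so the 24 postcodes/km² threshold corresponds to about 7,540 postcodes within the circle.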
Visitor involvement via online reviews: The number of reviews per month served as a proxy for visitor involvement and visitation volume. To validate stability of review propensity, the authors compared DCMS-sponsored museums and galleries’ monthly visitor counts (2016–2022) to their Google Maps review counts, finding a stable ratio outside closure periods (2020–early 2021). A log–log regression between monthly visitors and reviews showed a strong linear relationship (R²=0.73), supporting reviews as a proxy for visitation. Expected monthly review counts were forecast using ARIMA based on pre-pandemic trends to quantify shortfalls during/after the pandemic.
Inbound tourism correlation: Monthly overseas travel and tourism statistics (ONS) provided inbound visitor flows by source (Europe vs non-Europe) and purpose (holiday, business, visiting friends/relatives, miscellaneous). Reductions since 2019 were correlated with reductions in monthly reviews for urban vs rural sites.
Sentiment analysis setup: Each review’s 1–5 rating was binarized into positive (above mean) vs negative (below mean). The rating distribution is highly skewed (mean=4.39; >62% at 5 stars), so with whole-star ratings only 5-star reviews fall above the mean. Two sentiment analyses were conducted:
- Document-level subtopics: Four COVID-19 measure subtopics were targeted: face coverings, social distancing (incl. one-way systems/queuing), restrictions/closures, and sanitization (hygiene equipment such as hand gel). Detection used keyword lists plus BFV (a weakly supervised multi-label classifier) seeded with those keywords. A fuzzy fusion labeled subtopic presence as 1 if both methods detected the subtopic, 0 if neither did, and 0.5 if they disagreed. Logistic regressions related subtopic presence to sentiment, run separately for indoor and outdoor sites.
- Word-level attributions: A DistilBERT-based binary sentiment classifier was fine-tuned on COVID-related reviews. Integrated Gradients were used to attribute word-level contributions to predicted sentiment across the corpus, aggregating at the token level to identify words most associated with positive/negative sentiment.
Pseudo-qualitative summaries: For clarity on negative sentiments around face coverings and social distancing, Pegasus (fine-tuned on reddit_tifu) summarized the 25 most representative negative reviews nearest the subtopic centroid, to infer whether negativity stemmed from rule enforcement failures vs discomfort with rules.
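The fuzzy fusion rule for subtopic presence described above can be sketched as follows. This is only an illustration: the keyword list and function names are hypothetical, and the classifier decision is passed in as a plain boolean rather than produced by the weakly supervised model the authors used.

```python
import re

# Hypothetical seed keywords for the face-coverings subtopic (illustrative only).
FACE_COVERING_KEYWORDS = ("mask", "face covering")


def keyword_present(review: str, keywords: tuple[str, ...]) -> bool:
    """True if any keyword starts at a word boundary in the review text."""
    text = review.lower()
    return any(re.search(r"\b" + re.escape(k), text) for k in keywords)


def fuse_labels(keyword_hit: bool, classifier_hit: bool) -> float:
    """Combine two binary detections into a fuzzy presence label.

    1.0 when both methods detect the subtopic, 0.0 when neither does,
    and 0.5 when they disagree.
    """
    if keyword_hit and classifier_hit:
        return 1.0
    if not keyword_hit and not classifier_hit:
        return 0.0
    return 0.5
```

A review such as "Staff wore masks throughout" would give `keyword_present(...) == True`; if the classifier did not flag it, the fused label would be 0.5, which lets the downstream logistic regression treat uncertain cases as partial evidence rather than discarding them.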
Key Findings
- Scale and detection: ~1.4 million reviews were collected for 775 sites; 15,300 reviews were detected as COVID-19-related across 689 sites (86 sites had none). Distribution by type among these 689 sites: rural indoor 92, rural outdoor 237, urban indoor 183, urban outdoor 177.
- Review–visitor proxy validity: Outside closure periods, the visitor-to-review ratio was stable post-2016 for DCMS sites. The log–log regression between monthly visitors and Google reviews yielded R²=0.73, supporting reviews as a reliable visitation proxy.
- Recovery lag: By late 2021 into mid-2022, mentions of COVID-19 declined markedly, yet actual review volumes remained substantially below ARIMA-expected levels, indicating a lagging recovery in visitor involvement that was more pronounced at indoor sites than outdoor sites.
- Urban vs rural correlations: Reductions in monthly reviews for urban sites correlated significantly and positively with reductions in inbound visitors from Europe and with holiday-purpose travel; these correlations were not significant for rural sites. This suggests urban heritage sites, often reliant on international holidaymakers, were more adversely affected.
- Sentiment by subtopic and site type: Sanitization mentions were significantly associated with positive sentiment at indoor sites (but not significant outdoors), indicating perceived safety benefits in enclosed settings. Social distancing was significantly associated with positive sentiment outdoors (but not indoors), possibly reflecting easier implementation and perceived efficacy in open spaces. Mentions of restrictions/closures and face coverings were significantly associated with negative sentiment in both indoor and outdoor contexts.
- Word-level sentiments: Words like “closed” and “restrictions” were strongly negative; “COVID” often aligned with positive sentiment, reflecting expressions of relief or enthusiasm about reopening and returning to normality in many comments.
- Summarized negative themes: Pegasus summaries indicated that negative sentiments around face coverings and social distancing commonly stemmed from perceived failures in enforcement and non-compliance by other visitors/staff, rather than opposition to the measures themselves.
- Classification thresholds: Outdoorness threshold set at 0.83 (Places365-based); urban threshold set at 24 postcodes/km² within 10 km (OS Code-Point). COVID-topic NLI threshold set to 0.92 entailment probability to minimize pre-2020 false positives.
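To make the review–visitor proxy check concrete, the sketch below fits a log–log regression with ordinary least squares using only the standard library. It assumes log visitors are regressed on log reviews; the summary does not say which variable was the response, although R² is the same either way in a one-predictor fit. The function name is hypothetical, and the numbers in the usage note are illustrative, not the study's data.

```python
import math


def loglog_fit(reviews: list[float], visitors: list[float]) -> tuple[float, float, float]:
    """Fit log(visitors) = intercept + slope * log(reviews) by least squares.

    Returns (slope, intercept, r_squared). Inputs must be strictly positive,
    since the logarithm is undefined at zero.
    """
    xs = [math.log(r) for r in reviews]
    ys = [math.log(v) for v in visitors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Closed-form simple linear regression on the log-transformed values.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 is the share of variance in log(visitors) explained by the fit.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

With made-up monthly figures where visitors are exactly 500 times the review count, e.g. `loglog_fit([10, 20, 40, 80], [5000, 10000, 20000, 40000])`, the slope is 1 and the fit is perfect. A slope near 1 on real data would indicate a roughly constant visitor-to-review ratio, which is the stability the authors were checking for.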
Discussion
The research questions—how COVID-19 affected visitor involvement at UK heritage sites and how visitors perceived specific preventive measures—are addressed by combining large-scale review dynamics with fine-grained sentiment analyses. Despite declining explicit references to COVID-19 post-2021, review volumes remained below pre-pandemic trajectories, evidencing incomplete recovery in engagement/visitation. The stronger shortfalls at indoor sites align with heightened perceived risk and constraints in enclosed spaces. Significant correlations between urban-site review declines and reductions in European holiday inbound travel indicate that international tourism plays a critical role in urban heritage recovery. Sentiment analyses reveal that while restrictive measures (closures, restricted access) were disliked, hygiene measures in indoor settings and social distancing in outdoor contexts were generally appreciated, suggesting that measure effectiveness and context-appropriateness shape visitor satisfaction. Negative sentiment around masks and distancing often arose from inconsistent enforcement and non-compliance, not from the measures per se. These insights emphasize that clear, consistent implementation and visible adherence by staff and visitors can mitigate dissatisfaction while maintaining safety. For the field, the study demonstrates that social media reviews, analyzed with modern NLP/CV methods, can serve as scalable, retrospective, and cost-effective proxies for both visitation trends and targeted perception monitoring. This approach can inform resource allocation—especially toward urban indoor sites reliant on international visitors—and guide operational policies (e.g., enforcement strategies, access management) during recovery and future disruptions.
Conclusion
This study demonstrates that large-scale social media reviews can effectively quantify pandemic impacts on UK heritage sites, revealing that explicit COVID-19 concerns waned before visitation fully recovered, with prolonged deficits most evident in urban indoor sites dependent on international tourism. Visitors generally accepted safety measures when well-implemented and context-appropriate, while restrictions/closures drew consistent dissatisfaction. Methodologically, weakly supervised and zero-shot/deep learning NLP, coupled with CV, proved effective in extracting structured, survey-like insights from unstructured reviews and in tracking sentiment shifts over time. Managerial implications include prioritizing support and contingency planning for urban indoor heritage sites; ensuring consistent enforcement and compliance with safety measures; and maintaining access where safe to reduce disappointment. The approach offers a scalable, retrospective alternative to traditional surveys, supporting evidence-based decision-making. Future work could extend to multilingual analyses to capture non-English-speaking visitors, triangulate social media with ticketing/mobile mobility data, refine enforcement and wayfinding strategies through causal inference, and apply these methods to other shocks (e.g., natural disasters, economic downturns, conflicts) to monitor perception changes before/after disruptive events.
Limitations
Methodological limitations include: (1) Language ambiguity and informal social media expressions can hinder precise interpretation; (2) Uneven topic coverage in unstructured reviews may yield higher uncertainty for less-discussed topics and lacks the calibration frameworks of traditional surveys; (3) Passive, observational data preclude targeted questioning, limiting hypothesis testing. Application-specific constraints include reliance on Google Maps (potential demographic skew toward younger or more digitally active users), variable data quality due to lack of curation, and English-only analysis that may underrepresent international visitors’ perspectives. Classification choices (thresholds for outdoorness/urbanness, conservative NLI threshold) and ARIMA forecasting assumptions may introduce biases. Sentiment binarization (based on a skewed rating distribution) simplifies nuanced attitudes, and attribution methods (Integrated Gradients) reflect model-dependent interpretations.