Introduction
Real-time tracking of COVID-19 spread is challenging due to the delay between infection and reporting (approximately 9 days). This delay hinders timely public health interventions. Previous research has shown that a significant portion of COVID-19 transmission occurs pre-symptomatically, and superspreading events contribute to rapid outbreaks. Digital proxies, such as mobility data, have shown promise in monitoring disease transmission and intervention effectiveness. This study aims to develop a framework that integrates these digital proxies into conventional epidemic models to provide near real-time tracking of COVID-19 transmissibility and generate accurate nowcasts and short-term forecasts. The ability to achieve this would be a significant advancement in pandemic response, allowing for more rapid and effective implementation of public health measures.
Literature Review
Existing research has explored the use of digital proxies of human mobility and mixing to understand disease transmission dynamics. Studies have shown the utility of such data in monitoring the effectiveness of social distancing interventions during the COVID-19 pandemic. However, a key limitation has been the lack of a robust framework integrating these data directly into established epidemic models to provide real-time insights into transmission rates. This paper builds upon this previous work by developing a novel framework that directly incorporates these digital mobility data into a conventional age-structured SIR model to improve the accuracy and timeliness of epidemiological predictions.
Methodology
The study used COVID-19 case data from Hong Kong's Centre for Health Protection (CHP), stratified into imported and local cases. Epidemic curves by date of infection were approximated by deconvolving the date-of-onset curves with the incubation period distribution. The instantaneous effective reproduction number (Rt) was estimated using the method described in Thompson et al. (2019). Transaction data from Octopus cards, a ubiquitous payment system in Hong Kong, served as a digital proxy for population mixing and were stratified by age group (children, students, adults, elderly) and transaction type (transport, retail). The correlation between the empirical Rt estimates and the Octopus transaction data was calculated to assess the validity of the digital proxy. An age-structured susceptible-infectious-removed (SIR) model was parameterized using the age-specific Octopus transport transaction data. Scaling factors translating the digital proxies into the contact matrix were inferred by fitting the model to the epidemic curve of local cases. The model was used for nowcasting (estimating current infections not yet reported) and short-term forecasting. Model performance was assessed using metrics proposed by Funk et al. (2019), including sharpness, bias, ranked probability score (RPS), Dawid-Sebastiani score (DSS), and absolute error (AE). One sensitivity analysis evaluated the impact of assumptions about the generation time distribution on the model's results. A further sensitivity analysis incorporated household contact patterns into the framework, using data from a previous social contact survey.
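To make the modelling step concrete, the following is a minimal sketch of an age-structured SIR model whose contact matrix is scaled by a daily mobility proxy. It is not the authors' implementation: the function name `simulate_age_sir`, the outer-product scaling of contacts, the single scalar `scale`, the fixed recovery rate `gamma`, and the Euler time step are all illustrative assumptions. In the study, the scaling factors were inferred by fitting to the local epidemic curve rather than supplied by hand.

```python
import numpy as np

def simulate_age_sir(mobility, base_contact, scale, pop, i0, gamma=0.2, dt=1.0):
    """Euler-step a deterministic age-structured SIR model (illustrative only).

    mobility     : (T, A) daily proxy per age group, relative to baseline (1.0 = usual)
    base_contact : (A, A) hypothetical baseline transmission/contact matrix
    scale        : scalar scaling factor (assumed here; fitted in a real analysis)
    pop          : (A,) population size per age group
    i0           : (A,) initial number of infectious people per age group
    Returns daily new infections with shape (T, A) and the final S, I, R vectors.
    """
    mobility = np.asarray(mobility, dtype=float)
    base_contact = np.asarray(base_contact, dtype=float)
    pop = np.asarray(pop, dtype=float)
    i = np.asarray(i0, dtype=float).copy()
    s = pop - i
    r = np.zeros_like(pop)
    new_infections = np.zeros_like(mobility)

    for t in range(mobility.shape[0]):
        m = mobility[t]
        # Assumption: contact between groups a and b scales with m[a] * m[b].
        contact = scale * np.outer(m, m) * base_contact
        # Force of infection on each age group.
        force = contact @ (i / pop)
        # Cap at the number of susceptibles so compartments stay non-negative.
        infected = np.minimum(s, s * force * dt)
        recovered = gamma * i * dt
        s = s - infected
        i = i + infected - recovered
        r = r + recovered
        new_infections[t] = infected
    return new_infections, s, i, r
```

With the mobility proxy held at half its baseline value, the same parameters produce fewer infections than at baseline, which is the qualitative behaviour the framework relies on when translating transaction volumes into transmissibility.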
Key Findings
The empirical Rt estimate in Hong Kong was approximately 2.5 at the start of community transmission, rapidly decreasing to around 1 after interventions were implemented. A subsequent rebound to 2.5 occurred due to relaxed measures and increased importation of cases, followed by another drop due to renewed public health measures and spontaneous physical distancing. Octopus transport transactions showed a strong positive correlation with the empirical Rt estimates (r = 0.62–0.80 across age groups). The Rt estimates from the fitted model were highly correlated with the empirical Rt estimates (r = 0.98). The model estimated that only 23% (13–47%) of local infections were ascertained by official surveillance. Model fitting at multiple time points showed that the inferred relationship between transmission dynamics and digital proxies remained stable over time. The framework successfully generated near real-time Rt estimates, overcoming the delay inherent in case reporting. Nowcasts and forecasts were generally accurate, except during periods with superspreading events or very low prevalence, for which the deterministic model produced underestimates. Sensitivity analysis showed that the accuracy and precision of nowcasts and forecasts were not significantly affected by assumptions about the generation time distribution. The incorporation of household contact patterns did not improve model performance, suggesting that community transmission was the dominant driver of local cases.
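The comparison between Rt and the mobility proxy can be illustrated with a simplified calculation. The sketch below computes a point estimate of Rt from the renewal equation (incidence on day t divided by the generation-time-weighted sum of past incidence) and then a Pearson correlation with another series. This is a deliberately stripped-down stand-in, not the Thompson et al. (2019) method, which is Bayesian and uses a smoothing window; the function names `renewal_rt` and `pearson_r` and the generation-time weights are hypothetical.

```python
import numpy as np

def renewal_rt(incidence, gen_weights):
    """Naive point estimate R_t = I_t / sum_s w_s * I_{t-s}.

    incidence   : daily infections by date of infection
    gen_weights : generation-time weights for lags 1, 2, ... (normalised inside)
    Early values use a truncated history and are therefore biased upward.
    """
    incidence = np.asarray(incidence, dtype=float)
    w = np.asarray(gen_weights, dtype=float)
    w = w / w.sum()
    rt = np.full(incidence.shape, np.nan)
    for t in range(1, len(incidence)):
        k = min(t, len(w))
        # past[0] is I_{t-1}, past[1] is I_{t-2}, and so on.
        past = incidence[t - k:t][::-1]
        pressure = np.dot(w[:k], past)
        if pressure > 0:
            rt[t] = incidence[t] / pressure
    return rt

def pearson_r(x, y):
    """Pearson correlation, ignoring positions where either series is NaN."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = ~(np.isnan(x) | np.isnan(y))
    return float(np.corrcoef(x[ok], y[ok])[0, 1])
```

In use, `pearson_r(renewal_rt(cases, weights), transport_index)` would give a correlation analogous in spirit to the r values reported above, where `cases`, `weights`, and `transport_index` are placeholders for the analyst's own data.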
Discussion
The study demonstrates that digital proxies of population mobility can be used to track and predict COVID-19 transmission dynamics accurately in near real time. The strong correlation between Octopus card transactions and the effective reproduction number highlights the value of integrating readily available digital data into epidemiological modeling. The model's ability to nowcast and forecast the epidemic despite the delay in case reporting provides valuable information for timely public health interventions. While the model performed well overall, the influence of superspreading events and the stochasticity inherent in low-prevalence settings should be considered in future model development. The robustness of this approach relies on the availability and quality of digital proxies that accurately reflect population mixing patterns.
Conclusion
This study provides a novel framework for real-time tracking and prediction of COVID-19 transmission using digital proxies of population mobility. The integration of readily available data into established epidemiological models enables near real-time estimation of Rt and accurate nowcasting and forecasting, which can greatly improve the timeliness and effectiveness of public health interventions. Future research could incorporate more sophisticated models that account for superspreading events and stochasticity, and could apply this framework to other infectious diseases and geographic contexts. The increasing availability of digital data provides a powerful opportunity to improve real-time pandemic response globally.
Limitations
The study is limited to Hong Kong, and the generalizability to other settings depends on the availability and quality of comparable digital mobility data. The model's performance was affected by superspreading events and the stochastic nature of transmission in low-prevalence settings. The accuracy of the model also depends on the assumption that the Octopus card data accurately represent population mixing patterns. Future research is needed to validate the findings in other contexts and to improve the model's ability to capture the complexities of disease transmission.