Introduction
The COVID-19 pandemic spurred the development of numerous prediction models to inform public health decisions. Identifying reliable leading indicators for outbreaks is crucial for effective policy. Factors like mask-wearing, weather, and demographics have been linked to infection rates. The impact of non-pharmaceutical interventions (NPIs), such as government lockdowns, is well-studied, though disentangling the effects of overlapping NPIs remains challenging. Cell phone mobility data offers an appealing alternative to government mandates, providing direct observation of human movement and potentially offering a better proxy for risky in-person interactions. Publicly available data sources like Google's Community Mobility Reports and SafeGraph's data received widespread attention as mobility plummeted during the pandemic's first wave, featuring prominently in media and epidemiological dashboards. While many studies have used mobility data to predict COVID-19 spread, their conclusions often aren't broadly applicable beyond the initial wave. Limitations include short data spans (often only through June 2020), focus on major cities or coarser state-level analysis, and the assumption of a stationary relationship between mobility and infection rates. This assumption is questionable given shifts in behavior and adherence to government guidance. Capturing the time-varying relationship is challenging due to incomplete, heterogeneous, and non-stationary data, including limitations in mask-wearing data and challenges with reported case data (reporting delays, day-of-week effects, and varying testing rates). The short time frame of the data also hampers adjustments for seasonality. This study aims to identify a flexible, interpretable model to disentangle the changing effect of mobility over time and space.
Literature Review
A substantial body of research uses mobility data to predict COVID-19 spread, but many findings are limited in scope. Early studies focused on the first few months of the pandemic (up to June 2020) and often limited their analysis to major cities or aggregated state-level data. A key limitation of all but one of these studies is the assumption of a stationary relationship between mobility and infection rates, which is unrealistic given behavioral changes and varying adherence to NPIs. Many studies overlooked crucial confounding factors, including mask adoption, testing rates, and reporting delays. The lack of reliable data on adherence to mask-wearing further complicated the analysis, making it difficult to isolate the effects of mobility. Existing methods rely on restrictive assumptions that limit their generalizability. The research gap this paper addresses is the need for a more flexible and interpretable model capable of accounting for spatiotemporal variations in the relationship between mobility and infection rates.
Methodology
The study fits a multilevel regression model to county-level data across the US over one year. The model allows the association between mobility and infection growth rates to vary across groups of nearby counties and over four 13-week “waves.” Mobility comes from Google's Community Mobility Reports, which cover six categories (grocery/pharmacy, residential, retail/recreation, workplace, transit, parks). The infection growth rate is calculated as the log ratio of total infections over consecutive two-week periods. The authors first examine weekly county-level mobility and infection growth rate trends, visualizing data across US Census divisions and combined statistical areas (CSAs). They illustrate how overly flexible models can lead to overfitting and misleading inferences, such as spurious correlations due to collinearity of mobility measures. They address this by using principal component analysis to create a single mobility variable that captures over 60% of the variance in the original six mobility variables. The final model incorporates this univariate mobility measure, allowing its effect to vary by CSA and across the four 13-week waves. To address missing data, the authors employ Multivariate Imputation by Chained Equations (MICE). The model includes covariates such as population, temperature, and a national mask-use effect (modeled as constant over time). Model performance is assessed using R². The authors compare their final model to simpler models that allow less spatial and temporal variation, showing that more flexibility generally yields higher R². The robustness of their conclusions is assessed by comparing the results from Google's data with those obtained using SafeGraph's completely-at-home data.
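Two of the steps described above, the log-ratio growth rate and the reduction of several collinear mobility series to one principal component, can be illustrated with a minimal sketch. This is not the authors' code. It assumes that "consecutive two-week periods" means adjacent, non-overlapping two-week windows of weekly counts, and it adds a hypothetical smoothing constant `eps` to avoid taking the log of zero. The function names and the use of power iteration (rather than a library PCA routine) are choices made for this example only.

```python
import math


def growth_rate(weekly_infections, eps=1.0):
    """Log ratio of infections in the latest two weeks vs. the two weeks before.

    Assumption: windows are adjacent and non-overlapping. `eps` is a
    hypothetical smoothing constant, not something stated in the source.
    Returns one value per week, starting at the fourth week.
    """
    rates = []
    for t in range(3, len(weekly_infections)):
        recent = weekly_infections[t] + weekly_infections[t - 1]
        previous = weekly_infections[t - 2] + weekly_infections[t - 3]
        rates.append(math.log((recent + eps) / (previous + eps)))
    return rates


def first_principal_component(rows, iters=500):
    """First principal component of `rows` (observations x variables).

    Uses power iteration on the sample covariance matrix. Returns the
    loading vector, the share of total variance it explains, and the
    per-observation scores (the single combined mobility variable).
    Assumes the data are not constant, so the covariance is non-zero.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeatedly multiply by the covariance and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Rayleigh quotient gives the variance along the component.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, eigenvalue / total_variance, scores
```

In this sketch, each row passed to `first_principal_component` would hold the six mobility categories for one county-week, and the returned scores would play the role of the single mobility variable; the explained-variance share is the quantity the summary reports as exceeding 60%.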
Key Findings
The study replicates earlier findings that mobility was strongly associated with infection rates during the first wave of the pandemic (February 29, 2020 – May 23, 2020), particularly in the most populous counties. However, this association weakened considerably after the first wave, and it varied substantially across locations and time periods. Overly flexible models were shown to lead to spurious conclusions. The study's final, more carefully constrained model reveals significant spatiotemporal heterogeneity in the relationship between mobility and infection growth rates. The model fit was best during the first few months and in the most populated counties. R² values were generally lower in rural areas and during later waves of the pandemic. The model indicates a strong association between mobility and infection rates in New York City during waves one and two, but not in wave three. In contrast, Green Bay showed a strong association in waves two through four. San Francisco showed only a moderate association in the fourth wave. The authors conducted an ablation study by simplifying the model to limit how much the mobility effect could vary over time and space. Simpler models that average over local or temporal effects gave misleading results, masking the true spatiotemporal variation. Including a national mask effect in the model improved the fit during the first wave, indicating the confounding effect of simultaneous mask adoption and reduced mobility. Results were similar when SafeGraph's data replaced Google's: the association was consistent during the first wave but declined later, suggesting that broad cell phone mobility measures became less reliable at capturing person-to-person contact patterns.
Discussion
This study demonstrates that the relationship between mobility and COVID-19 infection rates is complex and not consistent across time and space. While mobility was a strong predictor during the first wave, particularly in densely populated areas, this association weakened significantly thereafter. The findings highlight the limitations of using coarse mobility data as a sole indicator of infection risk. Mobility is only an imperfect proxy for risky in-person interactions, a factor that is further complicated by changing behaviors (mask usage, social distancing, hygiene) and variations in the reliability of different mobility measures over time. The choice of model is also critical. Overly flexible models can lead to spurious correlations and overfitting. The authors’ flexible yet interpretable multilevel model provides a more nuanced understanding of the spatiotemporal dynamics. The study underscores the need for more targeted and context-specific approaches to using mobility data in public health interventions. Future research should explore finer-grained mobility data that better capture specific types of interactions (e.g., school attendance) and account for evolving behaviors and other factors influencing transmission.
Conclusion
This study provides valuable insights into the dynamic and heterogeneous relationship between cell phone mobility and COVID-19 spread in the US. The findings demonstrate that mobility is not a consistently reliable leading indicator of infection rates across time and space. The study highlights the importance of using flexible models that account for spatiotemporal variations and the need for caution in interpreting broad mobility data. Future work could involve incorporating more detailed mobility data, exploring the impact of other factors (like school closures and the presence of more infectious variants), and developing more advanced models to better predict infection dynamics.
Limitations
The study's conclusions are subject to limitations. The quality of reported case data is affected by testing rates and reporting delays. While the authors used a robust method to estimate true infection incidence, they acknowledge that longer-term trends in differential testing aren't fully captured. The lack of detailed mask-wearing behavior data, especially early in the pandemic, hindered efforts to completely disentangle mask and mobility effects. Missing data necessitated imputation, and the assumptions behind these imputation models are not easily verified. The short time-frame of the data limits the ability to account for seasonality and longer-term trends. Finally, the authors recognize that different types of mobility may be better proxies for risky behaviors at different times, and that the currently analyzed mobility metrics do not account for factors like school attendance.