Passive detection of COVID-19 with wearable sensors and explainable machine learning algorithms

M. Gadaleta, J. M. Radin, et al.

This study, led by Matteo Gadaleta and Jennifer M. Radin, shows that a machine learning model can detect COVID-19 infections using data from wearable devices. The results indicate that scalable, passive monitoring is achievable even without self-reported symptoms.

Introduction
Rapid identification and isolation of COVID-19 cases requires efficient monitoring strategies beyond frequent diagnostic testing and self-reported symptoms, which are often inaccessible or unreliable and which miss asymptomatic individuals. Passive monitoring using commercially available wearable sensors offers a promising alternative. Previous research demonstrated correlations between physiological signals (heart rate, sleep, activity) and COVID-19 infection, especially when combined with self-reported symptoms. However, these studies often focused on specific devices or signals. This research aimed to develop a device-agnostic machine learning algorithm that adapts to various wearable sensors and can detect COVID-19 infections even in the absence of self-reported symptoms, addressing the limitations of existing approaches.
Literature Review
Existing literature highlights the potential of wearable sensor data in identifying COVID-19 infection. Studies have shown that changes in heart rate variability, sleep patterns, and activity levels correlate with COVID-19 infection. However, these studies often focused on specific device brands or pre-defined signal sets, limiting their generalizability. Device-agnostic algorithms that adapt to varying data types and handle the absence of self-reported symptoms remain a critical gap in the field. This study aims to bridge this gap by developing an adaptable and explainable machine learning model.
Methodology
This prospective study used the DETECT (Digital Engagement and Tracking for Early Control and Treatment) app-based research platform. Participants (38,911; 61% female, 19% over 65) shared data from various wearable devices via Google Fit or Apple HealthKit. The study collected self-reported symptoms, COVID-19 test results, and sensor data (resting heart rate, sleep duration, activity levels). A gradient boosting prediction model based on decision trees was developed to detect COVID-19 infection. The model was trained and tested under several conditions: with or without self-reported symptoms, and with data from before the test date only or from both before and after it, to assess the impact of post-test behavioral changes. The model's performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Feature importance analysis was conducted to identify the most influential variables in the prediction model. Baseline values were calculated for each individual using a weighted average of past data, excluding the first six days and periods with reported symptoms. Normalized deviations from these baselines were used as features in the model.
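The baseline step described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the summary states only that a weighted average of past data was used, so the exponential decay weights, the `half_life_days` parameter, the use of a weighted standard deviation for normalization, and all function names are assumptions. Which days to leave out is passed in by the caller as a mask rather than derived here.

```python
import math


def weighted_baseline(values, excluded, half_life_days=14.0):
    """Return (mean, std) of past daily values, weighting recent days more.

    values   -- daily measurements, oldest first (None marks a missing day)
    excluded -- same length; True for days to leave out of the baseline
                (for example, days with reported symptoms)
    half_life_days -- hypothetical decay parameter, not taken from the study
    """
    n = len(values)
    weights, kept = [], []
    for i, (v, skip) in enumerate(zip(values, excluded)):
        if v is None or skip:
            continue
        age = n - 1 - i  # days before the most recent entry
        weights.append(0.5 ** (age / half_life_days))
        kept.append(v)
    if not kept:
        return None, None
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, kept)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, kept)) / total
    return mean, math.sqrt(var)


def normalized_deviation(today, mean, std, eps=1e-9):
    """Deviation of today's value from the baseline, in baseline std units."""
    if mean is None:
        return None
    return (today - mean) / max(std, eps)


# Illustrative values only (not study data): resting heart rate in bpm,
# with the two most recent days flagged as symptomatic and excluded.
history = [60, 62, 61, 59, 60, 61, 70, 72]
excluded = [False] * 6 + [True] * 2
mean, std = weighted_baseline(history, excluded)
feature = normalized_deviation(68, mean, std)
```

Expressing each signal relative to a person's own baseline is what lets one model serve different devices and users: a resting heart rate of 68 bpm is unremarkable for one person and a clear elevation for another.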
Key Findings
The study demonstrated that the machine learning model effectively discriminated between COVID-19 positive and negative individuals. For the symptomatic cohort (individuals reporting at least one symptom before the test), the model achieved an AUC of 0.83 (0.81–0.85) when using all available data and 0.78 (0.75–0.80) when using only pre-test data. In the non-symptom-reported cohort, the AUC was 0.74 (0.72–0.76) (all data) and 0.66 (0.64–0.68) (pre-test data). Self-reported symptoms were the most important features for prediction in the symptomatic cohort, accounting for 60% (pre-test data) and 46% (all data) of the model's predictive power. In the absence of self-reported symptoms, activity sensor features became more prominent, increasing from 46% to 54% in importance when only pre-test data was considered. Among the symptoms, fever, chills, fatigue, headache, and muscle ache showed the highest relative contribution to the model's predictions. The analysis showed a significant difference in model output between positive and negative individuals, even when considering only pre-test data. This suggests the model's ability to detect infection before the onset of symptoms.
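To make the reported metrics concrete, the sketch below computes AUC and the threshold-based measures named in the methodology from a set of model scores. The scores, the threshold, and the function names are made up for illustration and are not study data. AUC is computed here from its pairwise definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one), which is simple but quadratic in the number of cases.

```python
def auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score); ties count as half."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def threshold_metrics(scores_pos, scores_neg, threshold):
    """Sensitivity, specificity, PPV and NPV when scores >= threshold are called positive."""
    tp = sum(s >= threshold for s in scores_pos)
    fn = len(scores_pos) - tp
    fp = sum(s >= threshold for s in scores_neg)
    tn = len(scores_neg) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }


# Hypothetical model outputs for three positive and four negative individuals.
pos = [0.9, 0.7, 0.4]
neg = [0.8, 0.3, 0.2, 0.1]
area = auc(pos, neg)
metrics = threshold_metrics(pos, neg, threshold=0.5)
```

An AUC of 0.5 corresponds to chance and 1.0 to perfect separation. Unlike AUC, PPV and NPV depend on how common infection is in the tested group, so they do not transfer directly between populations with different prevalence.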
Discussion
The findings demonstrate the feasibility of passively detecting COVID-19 infection using wearable sensor data, even in the absence of self-reported symptoms. The model's adaptability to different sensor types and engagement levels significantly broadens its applicability. The AUC values indicate good discrimination in the symptomatic cohort (up to 0.83) and more modest discrimination when no symptoms are reported (0.66 to 0.74). The observed changes in feature importance based on symptom reporting highlight the model's ability to adapt to different data availability scenarios. The ability to predict COVID-19 infection using only pre-test data suggests the potential for early detection and intervention, which is crucial for controlling the spread of the disease. This passive monitoring approach could be scaled up and applied in settings where self-reported symptom data is limited or unavailable.
Conclusion
This study successfully demonstrated a machine learning model capable of passively detecting COVID-19 infection using wearable sensor data. The model's adaptability to varying data types and its performance even in the absence of self-reported symptoms provide a valuable tool for public health surveillance. Future research should focus on expanding the study population, including more diverse sensor data, and exploring the integration of additional physiological signals (e.g., respiratory rate, peripheral temperature) for improved accuracy and earlier detection of COVID-19 infection. Further investigation into the use of this passive monitoring approach in different settings and populations is warranted.
Limitations
The study's reliance on self-reported data for COVID-19 test results introduces potential biases. The generalizability of the model might be affected by the study population's characteristics. The relatively small number of positive cases in the non-symptom-reported cohort could limit the statistical power of the analysis in that subgroup. Further research is needed to validate these findings in larger, more diverse populations and to explore the long-term implications of this approach.