Introduction
Cost-effective wearable sensors are transforming healthcare through applications such as fitness tracking, remote patient monitoring, and large-scale health studies. Accurate human activity recognition (HAR) algorithms are essential for interpreting the data these sensors produce. However, current HAR research is constrained by the small size and often artificial nature of available labeled datasets, which hinders the application of data-hungry deep learning methods. This study addresses that limitation by leveraging the vast, unlabeled UK Biobank accelerometer dataset, which contains over 700,000 person-days of free-living activity data from roughly 100,000 participants. This rich resource makes it possible to explore deep learning methods for HAR without the constraints of the small, lab-based datasets that have characterized previous research.
Literature Review
Self-supervised learning (SSL) has shown great promise in leveraging unlabeled data for feature learning. Several SSL methods have been explored for wearable sensor data, including multi-task SSL, masked reconstruction, contrastive learning, and bootstrapping. A recent benchmark study indicated that multi-task SSL is effective in learning generalizable features for various HAR tasks. However, existing methods often used the same data for pre-training and fine-tuning, or relied on small datasets, limiting the generalizability of the pre-trained models. This study builds upon these existing methods, applying multi-task SSL to the massive UK Biobank dataset to develop a robust and generalizable HAR model.
Methodology
The study employs a two-step process. First, a deep convolutional neural network is pre-trained with multi-task SSL on the UK Biobank accelerometer data, using three self-supervised pretext tasks: arrow of time (AoT), permutation, and time warping. Weighted sampling addresses the class imbalance inherent in free-living activity data by prioritizing high-movement periods during training. The backbone is an 18-layer ResNet-V2 with 1D convolutions that produces a 1024-dimensional feature vector. Training uses the Adam optimizer with a learning rate of 1e-3 and linear learning-rate scaling to accommodate a large batch size of 6,000, distributed across four Tesla V100 GPUs. Second, the pre-trained network is fine-tuned via transfer learning on eight benchmark HAR datasets to evaluate representation quality and generalization across activities, populations, and devices; these datasets vary widely in size, activity classes, devices, data collection protocols, and participant characteristics. Evaluation uses held-one-subject-out cross-validation for smaller datasets and five-fold subject-wise cross-validation for larger ones. Results are compared against models trained from scratch and against random forest models built on time-series features. Ablation studies additionally assess how the volumes of labeled and unlabeled data affect downstream performance.
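To make the three pretext tasks concrete, the sketch below applies each transformation to a single tri-axial accelerometer window; for each task the network is trained to predict whether the transformation was applied. This is a minimal illustration: the function names, window length, sampling rate, and parameters are assumptions for the example, not taken from the study's released code.

```python
import numpy as np

def arrow_of_time(window, flip):
    """Reverse the window along the time axis when flip is True (AoT task)."""
    return window[::-1].copy() if flip else window

def permute(window, n_segments=4, rng=None):
    """Split the window into segments and shuffle their order (permutation task)."""
    rng = rng or np.random.default_rng()
    segments = np.array_split(window, n_segments, axis=0)
    rng.shuffle(segments)
    return np.concatenate(segments, axis=0)

def time_warp(window, strength=0.2, rng=None):
    """Locally stretch/compress the time axis by resampling (time-warping task)."""
    rng = rng or np.random.default_rng()
    n = window.shape[0]
    # Build a monotonically increasing, randomly distorted time grid,
    # then resample each axis back onto the regular grid.
    steps = np.clip(1.0 + strength * rng.standard_normal(n), 0.1, None)
    warped_t = np.cumsum(steps)
    warped_t = (warped_t - warped_t[0]) / (warped_t[-1] - warped_t[0]) * (n - 1)
    return np.stack([np.interp(np.arange(n), warped_t, window[:, c])
                     for c in range(window.shape[1])], axis=1)

# Example: a 10-second window at an assumed 30 Hz with x/y/z axes (300 x 3).
window = np.random.randn(300, 3)
x_aot  = arrow_of_time(window, flip=True)   # binary label 1 for the AoT head
x_perm = permute(window)                    # binary label 1 for the permutation head
x_warp = time_warp(window)                  # binary label 1 for the time-warp head
```

In the multi-task setup, each transformed window is fed through the shared encoder and a separate binary classification head per task predicts whether that transformation was applied.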
Key Findings
The pre-trained models consistently outperform baselines across all eight benchmark datasets, with a median relative improvement in F1 score of 18.4% over models trained from scratch and 8.5% over random forest models. The largest improvements are observed on the smaller datasets, highlighting the value of pre-training in data-scarce scenarios. Fine-tuning all layers consistently outperforms fine-tuning only the fully connected layers. Ablation studies show that the pre-trained models perform well regardless of the number of labeled subjects in downstream tasks, demonstrating robustness in limited-data settings. Increasing the number of unlabeled subjects used for pre-training linearly improves downstream performance, whereas varying the amount of unlabeled data per subject (while keeping the number of subjects constant) has minimal impact. Cluster analysis using UMAP reveals that the self-supervised features group similar activities and intensities more effectively than raw inputs or features from an untrained network. Explainable AI methods confirm that the model attends to relevant motion dynamics when predicting the pretext tasks. Finally, transfer learning experiments using the Capture-24 and Rowlands datasets as pre-training sources show that self-supervised pre-training on the UK Biobank consistently yields superior results.
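As an illustration of the cluster analysis described above, the sketch below projects fixed-length feature vectors (stand-ins for the 1024-dimensional encoder outputs) to two dimensions with UMAP and colours them by activity label. The arrays and UMAP parameters here are placeholder assumptions, not the study's actual configuration.

```python
import numpy as np
import umap                      # pip install umap-learn
import matplotlib.pyplot as plt

features = np.random.randn(500, 1024)        # stand-in for encoder feature vectors
labels = np.random.randint(0, 5, size=500)   # stand-in for activity classes

# 2-D projection of the high-dimensional features.
embedding = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2,
                      random_state=42).fit_transform(features)

plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, s=5, cmap="tab10")
plt.title("UMAP projection of learned features, coloured by activity")
plt.show()
```

Well-separated, label-consistent clusters in such a projection are what the study reports for the self-supervised features, in contrast to the mixed clusters obtained from raw inputs or an untrained network.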
Discussion
The results demonstrate the efficacy of self-supervised pre-training using the large-scale UK Biobank dataset for improving HAR performance, particularly in scenarios with limited labeled data. The significant and consistent improvements across diverse benchmark datasets highlight the generalizability and robustness of the learned representations. This contrasts with previous studies that relied on smaller datasets or the same data for both pre-training and fine-tuning. The study's success highlights the potential of using massive, unlabeled datasets in conjunction with self-supervised learning to create foundational models for various HAR applications. This approach effectively mitigates the data scarcity problem that has long hampered the field.
Conclusion
This study introduces a state-of-the-art HAR model trained using multi-task self-supervised learning on the large-scale UK Biobank accelerometer dataset. The pre-trained model significantly improves HAR performance across diverse benchmark datasets, particularly in data-scarce settings. The open-sourced model serves as a valuable foundation for future HAR research and applications. Future work could focus on expanding the pre-training data to include more diverse populations and modalities (e.g., electrocardiograms), as well as exploring other advanced self-supervised learning techniques.
Limitations
The pre-training data primarily consists of Caucasian participants from the UK, limiting the generalizability to other populations and regions. The study relies on existing benchmark datasets, some of which lack comprehensive licensing or consent information. Future work should address these limitations by incorporating more diverse datasets and focusing on robust data governance practices.