Introduction
Step length is a critical indicator of health and disease; it shortens with aging and with neurological disorders. Its importance stems from its strong correlation with gait speed and its predictive power for adverse outcomes such as falls, cognitive decline, dementia, and mortality. Quantitative step length estimates are needed to monitor subtle changes over time, track therapy response, and evaluate disease progression. Conventional methods such as camera-based systems and instrumented gait mats are accurate but offer only snapshot views, failing to capture the continuous variation in gait patterns across different times and contexts. Continuous monitoring with wearable devices, particularly inertial measurement units (IMUs), bridges the gap between lab-based assessments and real-world function, providing more comprehensive gait characterization. While IMUs can capture the relevant data, estimation methods based on double integration, kinematic modeling, and regression have limited accuracy and generalizability. Previous studies applying machine learning to step length estimation from IMU data have shown promise but often relied on small, homogeneous datasets, which limited generalizability. This study aims to develop a robust, generalized machine-learning model that accurately estimates step length from a single lower-back IMU across diverse populations, including people with Parkinson's disease, mild cognitive impairment, or multiple sclerosis and healthy controls, without requiring calibration or demographic data. The investigation also explores the trade-off between single-step and averaged-step estimations to balance accuracy with the ability to assess step-to-step variability.
Literature Review
Existing step length estimation (SLE) methods fall short in providing continuous and accurate measurements across diverse populations. Conventional techniques like camera-based systems and instrumented gait mats offer high accuracy but are limited to snapshots of gait, neglecting the temporal dynamics of walking. Continuous monitoring is becoming increasingly important for capturing information unavailable through traditional methods, revealing valuable insights into gait variability and real-world functional capacity. Inertial measurement units (IMUs) are promising candidates for continuous monitoring due to their portability and cost-effectiveness. However, various approaches to SLE using IMUs, including double integration, kinematic modeling, and regression methods, have shown limitations. Double integration suffers from drift and requires zero velocity updates. Kinematic modeling often necessitates calibration. Previous machine learning (ML) approaches have shown potential but were often limited by small, homogeneous datasets lacking generalizability. Several studies attempted to estimate step length and walking speed using smartwatches or IMUs, employing various ML models like linear regression, Gaussian process regression, support vector machines, and neural networks. However, these models often lacked the diversity and scale of datasets necessary to ensure robustness and applicability to diverse populations, including older adults and patients with neurological conditions.
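The drift problem with double integration can be illustrated with a minimal sketch. It assumes a stationary sensor whose accelerometer reports a small constant bias; the bias value and the rectangle-rule integrator are illustrative assumptions, not details from the study. Only the 128 Hz sampling rate matches the recordings described in the Methodology.

```python
def integrate_twice(accel, fs):
    """Double-integrate acceleration samples (m/s^2) with the rectangle rule.

    Returns the final displacement in metres, starting from rest.
    """
    dt = 1.0 / fs
    velocity = 0.0
    position = 0.0
    for a in accel:
        velocity += a * dt          # first integration: acceleration -> velocity
        position += velocity * dt   # second integration: velocity -> position
    return position


FS = 128        # sampling rate in Hz
BIAS = 0.05     # hypothetical constant accelerometer bias in m/s^2

# A sensor at rest that reports only the bias: the true displacement is zero,
# so everything returned here is drift.
drift_1s = integrate_twice([BIAS] * (FS * 1), FS)
drift_10s = integrate_twice([BIAS] * (FS * 10), FS)

print(f"drift after 1 s:  {drift_1s:.3f} m")
print(f"drift after 10 s: {drift_10s:.3f} m")
```

Because the position error grows with the square of elapsed time, ten times the duration gives roughly a hundred times the drift. This is why such methods reset the velocity whenever it is known to be zero.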
Methodology
This study used data from three projects: V-TIME, ONPAR, and MS-Watch. The V-TIME dataset comprised 149 patients with Parkinson's disease (PD), 27 with mild cognitive impairment (MCI), and 81 older adults (OA), all with a history of multiple falls. The ONPAR dataset included 75 PD patients and 38 healthy older adults. The MS-Watch dataset contained 61 multiple sclerosis (MS) patients and 41 healthy young adults. Participants performed 1-minute gait tests under comfortable-speed, fast-speed, and dual-task (cognitive) conditions. A lower-back-mounted Opal IMU recorded 3D acceleration and gyroscope data at 128 Hz, while the Zeno Walkway system served as the gold standard for step length measurement. In total, more than 83,000 steps were analyzed. Data preprocessing involved low-pass filtering (20 Hz cutoff) and step segmentation using a previously described algorithm. Feature extraction yielded 34 features, including FFT coefficients of the acceleration signals, energy measures, and double-integrated acceleration values. Features were selected with a stepwise approach. Several machine learning models were tested (linear regression, regression tree, SVM, KNN, XGBoost) along with a biomechanical inverted pendulum model. Fivefold cross-validation was used for model evaluation, with the V-TIME dataset used for model training and testing and the other two serving as independent validation sets. Model performance was assessed using root-mean-square error (RMSE), relative error (RA), and the intraclass correlation coefficient ICC(2,1). Bland-Altman analysis determined the limits of agreement. The effect of averaging consecutive step estimations on accuracy was also investigated. A non-segmented model using fixed-time windows was explored for real-time implementation. Statistical analysis included ANOVA and correlation tests.
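Two of the evaluation measures named above, RMSE and the Bland-Altman limits of agreement, can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function names are hypothetical, the step lengths are made up, and the limits use the conventional mean difference plus or minus 1.96 standard deviations.

```python
import math
import statistics


def rmse(estimated, measured):
    """Root-mean-square error between paired values (same units as the input)."""
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))


def bland_altman(estimated, measured):
    """Return (mean difference, lower limit, upper limit) of agreement.

    Limits are the mean difference +/- 1.96 sample standard deviations of the
    paired differences, the conventional 95% limits of agreement.
    """
    diffs = [e - m for e, m in zip(estimated, measured)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Made-up step lengths in cm, for illustration only (not study data).
measured = [50.0, 55.0, 60.0, 65.0, 70.0]
estimated = [53.0, 54.0, 62.0, 63.0, 68.0]

error = rmse(estimated, measured)
bias, lower, upper = bland_altman(estimated, measured)

print(f"RMSE: {error:.2f} cm")
print(f"bias: {bias:.2f} cm, limits of agreement: [{lower:.2f}, {upper:.2f}] cm")
```

RMSE summarizes the typical size of the error in one number, while the Bland-Altman limits separate the mean offset from the spread of the individual differences.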
Key Findings
The XGBoost model yielded the most accurate step length estimations, outperforming the other ML models and the inverted pendulum model. For single-step estimations, the XGBoost model achieved an RMSE of 6.08 cm and an ICC(2,1) of 0.89. The RMSE improved significantly when consecutive steps were averaged: 5.21 cm for 3 steps, 4.98 cm for 5 steps, and 4.79 cm for 10 steps (ANOVA, F = 23.0, p = 4.8e-6). Averaging enhances accuracy but sacrifices the ability to analyze step-to-step variability. The model showed the highest RMSE for PD participants (6.64 cm) and the lowest for MCI participants (5.27 cm); a possible explanation is the large number of PD participants in the training set and the smaller sample size and higher variability within the MCI group. Across gait conditions, the RMSE was lowest during comfortable walking (5.70 cm) and highest during fast walking (6.72 cm), suggesting sensitivity to walking speed. The model's generalizability was confirmed through validation on the independent ONPAR and MS-Watch datasets, demonstrating its robustness across diverse populations. A non-segmented model, trained on fixed-time windows, eliminated the need for step segmentation, allowing for real-time implementation. Although its gait speed RMSE is comparable to that of the original model, it cannot estimate step length directly.
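The effect of averaging consecutive steps can be explored with a small simulation. This is a sketch on synthetic data, not study data: the step-length distribution, the 6 cm per-step error, the assumption that errors are independent between steps, and the use of non-overlapping blocks are all illustrative choices.

```python
import math
import random


def rmse(estimated, measured):
    """Root-mean-square error between paired values."""
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))


def block_means(values, n):
    """Average each run of n consecutive values (non-overlapping blocks)."""
    usable = len(values) - len(values) % n   # drop an incomplete final block
    return [sum(values[i:i + n]) / n for i in range(0, usable, n)]


rng = random.Random(0)   # fixed seed so the sketch is repeatable

# Synthetic walk: true step lengths around 60 cm, and estimates that add an
# independent per-step error with a standard deviation of 6 cm.
measured = [60.0 + rng.gauss(0.0, 5.0) for _ in range(3000)]
estimated = [m + rng.gauss(0.0, 6.0) for m in measured]

results = {}
for n in (1, 3, 5, 10):
    results[n] = rmse(block_means(estimated, n), block_means(measured, n))
    print(f"averaging {n:2d} steps: RMSE = {results[n]:.2f} cm")
```

With fully independent errors the RMSE shrinks roughly as one over the square root of the number of averaged steps, which is much faster than the drop from 6.08 cm to 4.79 cm reported above. One possible reading is that part of the model's error is shared across consecutive steps of the same walk, so averaging cannot remove it.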
Discussion
This study demonstrates that combining a single, conveniently placed lower-back IMU with a machine learning model (XGBoost) yields highly accurate step length estimations, even in individuals with impaired gait. The RMSE below 5 cm achieved for averaged steps (a value often cited as the minimal clinically important difference, MCID) surpasses the accuracy of many existing biomechanical and machine learning methods. While the model shows a strong linear correlation between estimated and measured step lengths, a systematic bias was observed: larger steps were underestimated and shorter ones overestimated. This bias, common to many models, may matter less when assessing disease progression, which relies more on within-subject changes. The significant improvement in accuracy with averaging points to a trade-off between accuracy and the ability to capture step-to-step variability. The model's performance varied across participant groups and gait conditions, indicating potential for further refinement based on specific gait characteristics and walking speeds. The robustness of the model is evidenced by its consistent performance across multiple validation datasets, suggesting that it generalizes across diverse populations. The simplified non-segmented model opens avenues for real-time applications.
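One way to quantify the kind of systematic bias described above is to regress the estimation error on the measured step length: a negative slope means short steps are overestimated and long steps underestimated. The sketch below uses made-up numbers and a hypothetical helper, `least_squares`, and is not the analysis the study performed.

```python
def least_squares(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up values in cm (not study data): short steps are overestimated
# and long steps are underestimated.
measured = [40.0, 50.0, 60.0, 70.0, 80.0]
estimated = [44.0, 52.0, 60.0, 68.0, 76.0]

errors = [e - m for e, m in zip(estimated, measured)]
slope, intercept = least_squares(measured, errors)

# Step length at which the fitted error line crosses zero.
crossover = -intercept / slope

print(f"error slope: {slope:.2f} cm per cm, zero error near {crossover:.0f} cm")
```

Because a bias of this form is a smooth function of step length, it shifts a person's estimates in a consistent direction, which fits the point that within-subject change is less affected than absolute values.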
Conclusion
This study developed a highly accurate and generalizable step length estimation model using data from a single lower-back IMU and the XGBoost algorithm. The model achieves a clinically relevant RMSE below 5 cm when averaging over multiple steps, surpassing many existing methods in accuracy and in generalizability across diverse populations. Limitations remain, notably the systematic bias in step length estimations and the trade-off between averaging and the ability to analyze single steps and step-to-step variability. Even so, this work represents a significant advance toward practical, continuous monitoring of gait in various clinical settings. Future work should focus on optimizing the model for real-world, uncontrolled environments, refining its performance at the extremes of step length, and integrating algorithms for detecting gait transitions.
Limitations
The study focused primarily on straight-line walking in a laboratory setting, which may limit generalizability to more complex, real-world gait patterns. The model exhibited a systematic bias in estimating step length, underestimating larger steps and overestimating smaller ones. Averaging consecutive steps improves RMSE but reduces sensitivity to step-to-step variability. Although the model was tested on several populations, the limited number of participants in certain subgroups (e.g., MCI) may affect the reliability of the findings for those groups. The model's performance varied slightly across gait conditions (speeds and dual tasking), indicating that further improvement may be needed to enhance its adaptability in uncontrolled environments.