
Sensing leg movement enhances wearable monitoring of energy expenditure
P. Slade, M. J. Kochenderfer, et al.
This summary describes a Wearable System that estimates metabolic energy expenditure in real time and outperforms both a standard smartwatch and an activity-specific smartwatch. Developed by Patrick Slade, Mykel J. Kochenderfer, Scott L. Delp, and Steven H. Collins at Stanford University, the system could support energy balance systems for weight management and large-scale activity tracking.
Introduction
The study addresses the need for accurate, objective, and scalable monitoring of physical activity to combat physical inactivity, a major contributor to global mortality. Accurate daily energy expenditure estimation must capture both basal and active components across common activities (walking, stair climbing, running, biking) and handle time-varying bouts that constitute a large portion of daily behavior. Existing gold standards (respirometry, doubly labeled water) are accurate but impractical for everyday use; consumer wearables often rely on heart rate or wrist/trunk kinematics and show large errors, particularly for new users and time-varying conditions. The authors hypothesized that a data-driven method requiring no subject-specific calibration, using only lower-limb kinematics from IMUs and stride-segmented inputs, could estimate energy expenditure more accurately than state-of-the-art methods during common steady-state and time-varying activities. Lower-limb kinematics are expected to better reflect energy use by leg muscles and converge faster than physiological signals, enabling real-time, mobile estimation suitable for large-scale deployment.
Literature Review
The paper reviews limitations of common physical activity assessments. Self-report surveys show low-to-moderate correlation and inconsistent bias versus direct measures. Pedometers and smartphones provide step counts, but their accuracy varies with device and speed, and energy estimation from steps has high error and is limited to walking. Laboratory methods (respirometry and doubly labeled water) yield accurate steady-state or aggregate estimates but are expensive, intrusive, slow (minutes to days), and not feasible for daily monitoring. Simulation-based musculoskeletal models require many sensors and heavy computation and generalize poorly. Wearable data-driven approaches combine sensors (accelerometers/IMUs, heart rate, ECG, EMG, respiration/impedance pneumography) with models and can achieve relatively low errors when trained with subject-specific data, but errors roughly double for new subjects, limiting generalizability. Commercial activity monitors and smartwatches, which typically use wrist or hip acceleration thresholds and/or heart rate, show large errors (often 27–93%) for new subjects, may not capture lower-limb activity adequately, and respond slowly to changes in energy expenditure due to physiological delays.
Methodology
The study comprised four experiments and a full pipeline from sensor selection to model training and validation on new subjects and conditions.
1) Evaluating data-driven methods: Thirteen healthy adults completed 12 steady-state treadmill conditions (various walking/running speeds, sideways/backward walking, hopping, loaded walking). Ground truth for steady-state energy expenditure was Steady-State Respirometry (averaged over the last 3 minutes). For exploratory time-varying trials (n=4), subjects alternated walking/running with a 30 s period; ground truth was Interpolated Respirometry, interpolated between subject-specific steady-state values based on treadmill speed. Wearables included EMG (7 leg muscles), IMUs (7 locations: pelvis, foot, shank, thigh bilaterally), force-sensing insoles, and heart rate. Methods compared: Heart Rate Model, an Activity Monitor (ActiGraph-like counts), Musculoskeletal Model (OpenSim-based muscle energy), a Data-Driven Model (linear ridge regression using all wearable inputs segmented by stride), Per-Breath Respirometry, and Fast-Estimated Respirometry (first-order exponential fit to breaths). Leave-one-subject-out and leave-one-condition-out cross-validation assessed accuracy.
2) Sensor selection: The Data-Driven Model was tested with different sensor classes and IMU placements. IMUs were the best single class; two IMUs (one shank, one thigh) yielded the lowest error among IMU permutations.
3) Designing/training the Wearable System: Additional data were collected for stair climbing (stairmill at 40/60/80 steps/min) and cycling (20/70/150 W at 80 rpm) with 10–11 healthy adults, combined with prior walking/running data to train the final model. To improve robustness to IMU placement/orientation variance, synthetic training data were generated by applying random rotations to IMU signals.
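The synthetic-rotation step can be sketched as follows. This is a minimal illustration, not the authors' code: the rotation bound `max_angle_deg`, the number of copies, and the function names are assumptions, since the summary does not state how rotations were sampled.

```python
import numpy as np

def random_rotation(rng, max_angle_deg=15.0):
    """Return a 3x3 rotation matrix about a random axis (Rodrigues' formula).

    max_angle_deg is a hypothetical bound; the source does not give one.
    """
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    # Skew-symmetric cross-product matrix of the unit axis
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def augment_imu(accel, gyro, rng, n_copies=5, max_angle_deg=15.0):
    """Rotate accelerometer and gyroscope signals of shape (n_samples, 3).

    Each copy applies one shared rotation to both signals, mimicking a
    sensor that was strapped on in a slightly different orientation.
    """
    copies = []
    for _ in range(n_copies):
        R = random_rotation(rng, max_angle_deg)
        copies.append((accel @ R.T, gyro @ R.T))
    return copies

# Stand-in signals; real data would come from the thigh and shank IMUs.
rng = np.random.default_rng(0)
accel = rng.normal(size=(200, 3))
gyro = rng.normal(size=(200, 3))
augmented = augment_imu(accel, gyro, rng)
```

Because a rotation preserves vector length, each augmented sample keeps the same acceleration and angular-velocity magnitude as the original; only its distribution across sensor axes changes.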
The Wearable System hardware included a Raspberry Pi 3B+, battery, and two Adafruit IMU boards worn on one leg (thigh and shank), costing ~$100, weighing 232 g, and operating for 7.3 hours. Real-time stride detection used peaks in shank sagittal-plane angular velocity (low-pass filtered at 6 Hz; ≥0.5 s between peaks) to segment strides. The system detected quiet standing (no strides for ≥8 s) and estimated energy expenditure during it from basal energy expenditure (Mifflin equation) scaled by the quiet-standing ratio observed in the training data.
4) Validation with new subjects/conditions: A diverse cohort (n=24; 15 men, 9 women; age 34.8±11.6 y; mass 74.3±13.1 kg; height 1.73±0.07 m; BMI 24.9±4.1) performed two new steady-state conditions for each activity at intermediate intensities (walk 1.0 and 1.5 m/s; run 2.5 and 3.0 m/s; stairs 50 and 70 steps/min; bike 50 and 120 W at 80 rpm) and four time-varying treadmill profiles (step and sinusoidal transitions between standing/walking and walking/running). Steady-state ground truth was averaged Steady-State Respirometry (last 3 minutes of 6-minute trials). Time-varying ground truth used Interpolated Respirometry; cumulative energy expenditure was compared to Per-Breath Respirometry. Compared methods: Wearable System (two IMUs), Smartwatch (Apple Watch Series 1), Activity-Specific Smartwatch (Apple workout modes), Heart Rate Model, Per-Breath and Fast-Estimated Respirometry.
Data processing: EMG filtering (30–500 Hz bandpass, rectification, 6 Hz low-pass, normalization), insole force correction/drift handling, OpenSense/OpenSim for sagittal joint kinematics from IMUs (calibrated per condition), synchronization, and stride segmentation.
Data-Driven Model details: ridge regression (λ=1), with inputs discretized to 30 bins per stride across IMU accelerometer/gyroscope axes from thigh and shank; features also included subject height, weight, and stride duration. Training used the last 50 strides per condition, with each feature standardized using the training means/SDs.
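A minimal sketch of the Data-Driven Model's fitting step, using closed-form ridge regression with λ=1 on standardized features. The feature layout (12 channels of 30 bins plus height, weight, and stride duration), the function names, and the synthetic data are assumptions for illustration; the summary does not specify the exact feature ordering or how the intercept was handled.

```python
import numpy as np

N_BINS = 30         # bins per stride (from the text)
N_CHANNELS = 12     # assumed: 2 IMUs x (3 accelerometer + 3 gyroscope axes)
RIDGE_LAMBDA = 1.0  # regularization strength (from the text)

def fit_ridge(X, y, lam=RIDGE_LAMBDA):
    """Fit ridge regression on features standardized with training mean/SD.

    Returns (weights, intercept, mean, std) so the same scaling can be
    reused at prediction time.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant features
    Z = (X - mean) / std
    intercept = y.mean()                # unpenalized intercept (assumption)
    A = Z.T @ Z + lam * np.eye(Z.shape[1])
    weights = np.linalg.solve(A, Z.T @ (y - intercept))
    return weights, intercept, mean, std

def predict(X, weights, intercept, mean, std):
    """Estimate energy expenditure (W) for each stride's feature row."""
    return ((X - mean) / std) @ weights + intercept

# Synthetic stand-in data: one row per stride, one column per feature.
rng = np.random.default_rng(0)
n_strides = 600
n_features = N_BINS * N_CHANNELS + 3  # + height, weight, stride duration
X_train = rng.normal(size=(n_strides, n_features))
true_w = rng.normal(size=n_features)
y_train = 300.0 + X_train @ true_w + rng.normal(scale=0.5, size=n_strides)

model = fit_ridge(X_train, y_train)
y_hat = predict(X_train, *model)
```

Once the weights are fixed, a prediction is a single matrix-vector product per stride, which is in keeping with the low on-device computation time the study reports.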
The Activity-Specific Model used perfect manual activity labels with one linear model per activity. The Musculoskeletal Model estimated per-muscle metabolic rates using measured kinematics and EMG-driven activations in OpenSim, validated against joint moments; it was limited to five strides per condition due to computation time. Smartwatch data were exported from Apple Health and converted to watts. Performance metrics: absolute error, relative error (after subject-level bias removal), and cumulative energy expenditure error (including 3 minutes post-condition). Statistical analysis used Kruskal–Wallis and paired t tests with Bonferroni correction (α=0.0033). Usability was assessed post hoc with the System Usability Scale (SUS) and comfort using a Questionnaire for User Interaction Satisfaction-derived survey.
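The performance metrics can be sketched roughly as below. The exact definitions are assumptions: here "bias removal" subtracts a subject's mean signed error before computing percentage error, and the Bonferroni example assumes a base α of 0.05 over 15 comparisons, which is consistent with the reported α=0.0033 but is not stated in the summary.

```python
def mean_abs_pct_error(estimates, truths):
    """Mean absolute error as a percentage of the ground-truth value."""
    total = sum(abs(e - t) / t for e, t in zip(estimates, truths))
    return 100.0 * total / len(truths)

def bias_removed_pct_error(estimates, truths):
    """Percentage error after removing one subject's mean signed error.

    This is one plausible reading of "subject-level bias removal".
    """
    bias = sum(e - t for e, t in zip(estimates, truths)) / len(truths)
    return mean_abs_pct_error([e - bias for e in estimates], truths)

def cumulative_pct_error(est_watts, true_watts, dt_s):
    """Percentage error in total energy (J), integrating power over time."""
    est_joules = sum(est_watts) * dt_s
    true_joules = sum(true_watts) * dt_s
    return 100.0 * abs(est_joules - true_joules) / true_joules

def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_comparisons

# Hypothetical values for one subject across three conditions (W).
truths = [300.0, 400.0, 500.0]
estimates = [330.0, 440.0, 550.0]
abs_err = mean_abs_pct_error(estimates, truths)
rel_err = bias_removed_pct_error(estimates, truths)
cum_err = cumulative_pct_error(estimates, truths, dt_s=1.0)
alpha = bonferroni_alpha(0.05, 15)
```

In this toy example the estimates are uniformly 10% high, so removing the subject's mean offset reduces the error considerably; that is why absolute and bias-removed relative errors are reported separately.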
Key Findings
- Data-Driven Model performance (offline evaluation): 10.5% relative error during steady-state, about half the next best wearable model; reached accurate estimates quickly as inputs converged. During time-varying conditions, absolute error was 7%, about one-quarter of the next best model; physiological-signal-based methods exhibited delayed responses.
- Sensor selection: IMUs were the most informative class; two IMUs (shank + thigh) achieved 13.7% relative error versus 10.5% using all sensors; a single thigh IMU yielded 16.7% error.
- Wearable System hardware and operation: ~US$100 cost, 232 g, ~7.3 h battery life; real-time estimates computed in ~0.01 s on-device.
- Steady-state validation on new subjects (n=24): Wearable System had 13% steady-state error, lower than all other wearable methods (paired t tests p≤1×10⁻⁶). Cumulative energy expenditure error for steady-state was 12%, significantly lower than others (38–71%; p≤2×10⁻¹⁴). It had the lowest error for the first 44 s; after that, Fast-Estimated Respirometry was most accurate among laboratory methods.
- Time-varying conditions: Wearable System accurately tracked step and sinusoidal changes. Error over time was 23%, significantly less than 46–105% for other methods (p≤3×10⁻⁴). Cumulative energy expenditure error was also significantly lower (p≤7×10⁻⁴).
- Overall across all steady-state and time-varying conditions: Wearable System cumulative error 13%, versus 42–86% for other methods (p≤2×10⁻²¹). In headline comparison across common activities, cumulative error was 13% versus 42% (Smartwatch) and 44% (Activity-Specific Smartwatch). A thigh-only IMU version had 19% cumulative error.
- Model interpretability: Larger regression weights were assigned to gyroscope channels and to inputs in the second and fifth stride quintiles; cosine similarity between model weights and input standard deviations across activities was 0.96, suggesting the model exploits stride-phase differences to differentiate activities.
- Usability: SUS score 80.9/100 (≈90th percentile of 5000-device benchmark), with high comfort ratings.
- Robustness: Adding synthetic IMU rotations during training improved robustness to sensor placement/orientation.
- No significant correlation of estimation error with age, height, weight, or BMI; no gender difference in steady-state absolute error.
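The interpretability comparison relies on cosine similarity, which can be computed as in this short sketch. The vectors are made-up placeholders, and comparing weight magnitudes to input standard deviations is an assumption about how the study set up the comparison.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical per-bin values, not taken from the study.
weight_magnitudes = [0.2, 0.9, 0.4, 0.3, 0.8]
input_std_across_activities = [0.25, 0.85, 0.35, 0.30, 0.90]
similarity = cosine_similarity(weight_magnitudes, input_std_across_activities)
```

A value near 1 means the two vectors point in nearly the same direction, that is, the model places its largest weights on the stride phases where the inputs vary most between activities.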
Discussion
Findings support the hypothesis that lower-limb kinematics segmented by stride enable accurate, real-time estimation of energy expenditure without subject-specific calibration. By focusing on leg IMUs, the Wearable System captures signals closely tied to energy use by lower-limb muscles and avoids the latency inherent in physiological measures like heart rate or breath-based respirometry, improving responsiveness during time-varying activity. Stride-based segmentation creates time-invariant inputs that simple linear models can exploit, contributing to low computational cost and ready portability. The system substantially outperforms smartwatches and activity monitors, likely because those rely on wrist/trunk kinematics and/or heart rate and do not leverage stride structure; wrist placement is suboptimal for activities like stairs and cycling where wrist motion may be minimized. Surprisingly, carefully chosen two-sensor configurations outperformed more comprehensive sensor suites and heart rate fusion, underscoring the importance of principled sensor selection and placement. The method achieved consistent performance across diverse adults and across different activities, suggesting suitability for large-scale monitoring applications in health research and potential integration into energy balance systems for weight management. Laboratory respirometry remains valuable for steady-state calibration and model training, while the wearable approach enables real-world deployment.
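To illustrate why stride segmentation yields time-invariant inputs, the sketch below detects stride boundaries from peaks at least 0.5 s apart and resamples each stride to 30 points. It is a simplified stand-in: the peak picker is a basic greedy one, the 6 Hz low-pass filter is omitted because the synthetic signal is already smooth, `min_height` is a hypothetical threshold, and linear interpolation is one possible way to discretize a stride into bins.

```python
import numpy as np

def detect_stride_peaks(gyro_sagittal, fs, min_interval_s=0.5, min_height=1.0):
    """Return indices of local maxima that are at least min_interval_s apart.

    Greedy and causal: the first qualifying peak is kept and later peaks
    inside the refractory window are ignored.
    """
    min_gap = int(round(min_interval_s * fs))
    peaks = []
    for i in range(1, len(gyro_sagittal) - 1):
        is_peak = gyro_sagittal[i - 1] < gyro_sagittal[i] >= gyro_sagittal[i + 1]
        if is_peak and gyro_sagittal[i] >= min_height:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return peaks

def bin_stride(signal, start, end, n_bins=30):
    """Resample samples start..end (inclusive) to n_bins evenly spaced points."""
    src = np.linspace(0.0, 1.0, end - start + 1)
    dst = np.linspace(0.0, 1.0, n_bins)
    return np.interp(dst, src, signal[start:end + 1])

def synth_strides(period_s, n_strides, fs):
    """Synthetic angular velocity: one cosine cycle per stride."""
    n = int(period_s * fs * n_strides)
    phase = np.arange(n) / (period_s * fs)
    return 3.0 * np.cos(2.0 * np.pi * phase)

fs = 100  # Hz, assumed sampling rate
signal = np.concatenate([synth_strides(1.0, 4, fs), synth_strides(1.5, 4, fs)])

peaks = detect_stride_peaks(signal, fs)
durations = np.diff(peaks) / fs                      # stride durations in s
fast_bins = bin_stride(signal, peaks[0], peaks[1])   # a 1.0 s stride
slow_bins = bin_stride(signal, peaks[4], peaks[5])   # a 1.5 s stride
```

The fast and slow strides differ in duration but produce nearly identical 30-point profiles, so a fixed-length linear model can be applied to both, with stride duration supplied as a separate feature.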
Conclusion
The study introduces and validates a low-cost, portable Wearable System that uses two IMUs on the thigh and shank and a simple stride-segmented linear model to estimate metabolic energy expenditure in real time across walking, running, stair climbing, and cycling. It achieves markedly lower errors than state-of-the-art wearable methods, including smartwatches, for both steady-state and time-varying activities, while maintaining high usability. Contributions include: demonstrating the superiority of lower-limb kinematics and stride segmentation; rigorous sensor selection showing two leg IMUs suffice; a robust training strategy with synthetic rotations; and comprehensive validation on new subjects and conditions. Future work should refine hardware for everyday use (e.g., integrate into clothing/shoes or leverage smartphones), expand and diversify training data to cover more activities and populations, explore more expressive models when sufficient data are available, and consider additional signals (e.g., respiration frequency) to differentiate conditions with similar kinematics. Integration with caloric intake estimation could enable a practical energy balance system for personalized weight management and population-scale physical activity monitoring.
Limitations
- Training data were collected from relatively small cohorts of young, healthy subjects (n=13 for model training and sensor selection; n≈10–11 for stair/bike training), potentially limiting generalizability to other populations and activities.
- The model requires training data for each activity to be monitored; performance degrades when estimating activities with energy expenditures far from those represented in training.
- Time-varying ground truth was approximated via interpolation between steady-state respirometry values and does not capture additional energetic costs of acceleration (estimated to bias cumulative costs by ~2–4%).
- Activities above the aerobic threshold and those dominated by upper-limb motion were not considered and may require additional sensors or methods.
- The prototype hardware (Raspberry Pi with wired IMUs) is bulky; productization and long-term, free-living validation remain to be demonstrated.
- Early trials with more complex machine learning models overfit due to limited data; larger datasets would be necessary to evaluate advanced models.
- The method may be susceptible to errors in conditions with similar kinematics but different resistive loads (e.g., cycling at different resistances) if users maintain very consistent kinematics, though fixed-cadence cycling here did not show this issue.