Evaluation of physical health status beyond daily step count using a wearable activity sensor
Z. Xu, N. Zahradka, et al.
The study addresses whether wearable activity monitors can provide clinically relevant metrics beyond daily step count to assess physical and cardiovascular health status in free-living settings. The context is the growing use of consumer wearables (e.g., Fitbit) that record minute-level step rate and heart rate, offering far richer data than aggregate daily steps. Prior reliance on daily steps, gait speed, and clinic-based timed walk tests limits information content and temporal frequency. The objective is to extract and validate additional metrics derived from minute-to-minute step rate and heart rate to capture endurance, intensity, frequency of ambulations, fitness, free-living 6-minute walk distance (FL6MWD), device usage, and a composite physical health state (PHS). The study compares these wearable-derived metrics to clinical parameters collected at two clinic visits and uses multivariate analyses (PCA, LPA) to identify patient subgroups and assess potential clinical relevance.
The introduction summarizes evidence that higher daily step counts and ambulation parameters (e.g., gait speed, 6MWT) predict favorable outcomes, including reduced all-cause mortality and lower adverse events in hospitalized patients. It notes that while remote monitoring has been enabled by advances in wearables, clinical use has focused mostly on daily step count. Accuracy concerns exist for step detection in atypical gait and in patient populations, yet prior studies in cancer, cardiovascular disease, PAH, and multiple sclerosis show clinically relevant data from wearables. Heart rate measured by photoplethysmography generally agrees with ECG at rest/low activity, though factors like skin pigmentation may influence accuracy. In PAH, resting heart rate (RHR) below 82 BPM was associated with longer event-free survival, suggesting HR metrics may hold prognostic value. The literature indicates untapped potential in minute-level wearable data for richer health assessment.
Design and participants: Prospective, observational study of adult PAH patients who wore a Fitbit Charge HR continuously between two routine outpatient clinic visits. IRB-approved (Cleveland Clinic IRB 15–1392); written informed consent obtained. Thirty enrolled; 22 included in analysis (both clinic visits and at least 4 weeks of wearable data). Average monitoring duration was 18.4 ± 12.2 weeks (range 7–65), totaling ~3.5×10^6 minutes across 405 weeks.
Clinical data: At each visit, up to 26 parameters (19 categorical, 7 continuous) were collected, including HR measures (resting and peak), 6-minute walk distance (6MWD), right ventricular systolic pressure (RVSP), and biomarkers (hemoglobin, albumin, NT-proBNP). HRQOL scores (EQ-5D domains, EQ VAS), WHO Functional Class (physician- and patient-assessed), modified Borg dyspnea score, and symptom/organ function assessments were included.
Wearable data acquisition and preprocessing: Minute-to-minute step rate (SR, in steps per minute, SPM) and heart rate (HR, in BPM) were downloaded from Fitbit’s API. Data were segmented into weeks (Sunday 00:00 to Saturday 23:59). A minute was considered “device worn” if HR was physiological (20 BPM ≤ HR ≤ age-predicted maximum, HRmax = 208 − 0.7×age). Analyses focused on SR distributions (excluding SR=0), HR at SR=0 [HR(SR=0)], and HR at SR>0 [HR(SR>0)]. SR distributions were fit to a log-normal and HR distributions to a normal; summary statistics included mean, standard deviation, and skewness.
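The wear criterion and the split of heart rate into HR(SR=0) and HR(SR>0) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the (heart rate, step rate) tuple layout, and the use of None for missing samples are all hypothetical.

```python
def hr_max(age_years):
    """Age-predicted maximum heart rate: HRmax = 208 - 0.7 * age."""
    return 208.0 - 0.7 * age_years


def is_worn(hr_bpm, age_years):
    """A minute counts as 'device worn' if 20 BPM <= HR <= HRmax."""
    if hr_bpm is None:  # assumption: a missing HR sample is stored as None
        return False
    return 20.0 <= hr_bpm <= hr_max(age_years)


def split_worn_minutes(minutes, age_years):
    """Split worn minutes into HR at rest (SR = 0) and HR while stepping (SR > 0).

    `minutes` is an iterable of (heart_rate_bpm, step_rate_spm) pairs,
    one pair per minute (hypothetical layout).
    """
    hr_rest, hr_active = [], []
    for hr, sr in minutes:
        if not is_worn(hr, age_years):
            continue  # non-physiological or missing HR: treat as not worn
        if sr == 0:
            hr_rest.append(hr)
        else:
            hr_active.append(hr)
    return hr_rest, hr_active


# Made-up minutes for a 50-year-old: two samples fall outside the wear range.
sample = [(62, 0), (None, 0), (95, 80), (10, 0), (250, 0), (70, 0)]
rest, active = split_worn_minutes(sample, age_years=50)
print(rest, active)
```

The two resulting lists are what the distribution fits (normal for HR) would then be applied to, week by week.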
Derived wearable metrics:
- Activity maps and HR–SR relationship: Weekly scatter of HR vs SR; linear least-squares fit provided slope (BPM per SPM) and intercept (HR at SR=0). Envelopes (upper/lower) were used to characterize activity space (area).
- Fitness (PWC170 analog): Assumed SR as proxy for effort/power. For each subject, HR and SR were averaged in 20-SPM bins to obtain a linear HR–SR relationship; slope (BPM/SPM) and intercept (BPM at SR=0) were calculated per week, then averaged across weeks to estimate fitness slope/intercept.
- Ambulation metrics: Defined “ambulation” as ≥2 min sustained SR ≥60 SPM (slow walking or faster). Weekly: frequency (count), endurance (1/e parameter from exponential fit of ambulation duration histogram starting at 2 min), intensity (SD of SR distribution from log-normal fit). Ambulation product P = frequency × endurance × intensity.
- Device usage: Usage defined as fraction of minutes in a week with physiological HR (minutes worn/10,080). Maximum off-time per week estimated from the longest continuous non-wear period. Usage heat maps (hour-by-day) visualized patterns; trends estimated by exponential smoothing (α=0.3) and linear fit.
- Fraction of time inactive: Minutes with SR=0 divided by total minutes worn per week.
- Free-living 6MWD (FL6MWD): Identified 6-minute window with maximum cumulative steps per week; converted steps to distance using step length = 0.413×height (m) for females and 0.415×height (m) for males; reported weekly FL6MWD and temporal slope (after smoothing with α=0.3).
- Physical health state (PHS): PHS = FL6MWD/H6MWD, where H6MWD is predicted 6MWD for a healthy individual of same age, gender, and BMI using H6MWD (m) = 890.46 − 6.11×age + 0.0345×age^2 + 48.87×gender − 4.87×BMI (gender: 0 female, 1 male).
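The FL6MWD and PHS definitions above reduce to a sliding-window maximum and two formulas. The sketch below uses the coefficients quoted in the text; the function names and the input layout (a time-ordered list of per-minute step counts with no gaps) are assumptions, and handling of non-wear gaps is deliberately left out.

```python
def fl6mwd_m(step_rates_spm, height_m, is_male):
    """Free-living 6MWD (m): best 6-minute window of steps times step length."""
    window = 6
    if len(step_rates_spm) < window:
        raise ValueError("need at least 6 minutes of data")
    current = sum(step_rates_spm[:window])
    best = current
    # Slide the window one minute at a time, updating the running sum.
    for i in range(window, len(step_rates_spm)):
        current += step_rates_spm[i] - step_rates_spm[i - window]
        best = max(best, current)
    # Step length = 0.413 x height (females) or 0.415 x height (males).
    step_length_m = (0.415 if is_male else 0.413) * height_m
    return best * step_length_m


def h6mwd_m(age_years, is_male, bmi):
    """Predicted 6MWD (m) for a healthy individual (gender: 0 female, 1 male)."""
    gender = 1 if is_male else 0
    return (890.46 - 6.11 * age_years + 0.0345 * age_years ** 2
            + 48.87 * gender - 4.87 * bmi)


def phs(fl6mwd, h6mwd):
    """Physical health state: ratio of free-living to predicted healthy 6MWD."""
    return fl6mwd / h6mwd


# Made-up example: a 50-year-old woman, 1.65 m tall, BMI 25.
steps = [0, 0, 100, 100, 100, 100, 100, 100, 50, 0]
distance = fl6mwd_m(steps, height_m=1.65, is_male=False)
print(round(distance, 1), round(phs(distance, h6mwd_m(50, False, 25)), 2))
```

In practice the weekly input would hold 10,080 minutes, and minutes flagged as not worn would need to be excluded or treated as window breaks; that choice is not specified here.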
Subgrouping and multivariate analyses:
- Thresholding approach: Subjects divided into subgroups based on predefined or exploratory thresholds for wearable metrics (e.g., daily steps 5000; HR(SR=0) 82 BPM; HR(SR>0) 95 BPM; ambulation P 1000; fitness slope 0.15; usage 0.94; FL6MWD 320 m and 400 m). Mann–Whitney tests compared clinical parameters between subgroups; comparisons required ≥5 subjects per group; no multiple-comparison corrections.
- PCA: Conducted in MATLAB on five weekly-averaged metrics: mean HR(SR=0), SD HR(SR>0), mean SR(SR>0), SD SR(SR>0), and fraction time inactive. Data normalized to mean 0, SD 1. Included subjects with ≥10 weeks (N=20). PC1 and PC2 explained 48.6% and 30.0% of variance, respectively; across 100 runs, mean variance for PC1+PC2 was 77.5 ± 0.58%.
- LPA: Performed in R (mclust v5.4.10) on eight Fitbit metrics (steps/day, HR(SR=0), HR(SR=0) skewness, HR(SR>0), ambulation P, fitness slope, FL6MWD, usage); model selection by BIC; three groups identified.
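The thresholding step can be illustrated with a short sketch: split subjects on a wearable metric, enforce the minimum of 5 subjects per group, and compute the Mann–Whitney U statistic. The data and names below are hypothetical, and only the U statistic is computed; a p-value would normally come from a statistics library (for example `scipy.stats.mannwhitneyu`), which is not necessarily what the authors used.

```python
def split_by_threshold(metric_by_subject, clinical_by_subject, threshold):
    """Clinical values for subjects with metric < threshold and >= threshold."""
    low, high = [], []
    for subject, metric in metric_by_subject.items():
        value = clinical_by_subject.get(subject)
        if value is None:  # skip subjects missing the clinical parameter
            continue
        if metric < threshold:
            low.append(value)
        else:
            high.append(value)
    return low, high


def mann_whitney_u(x, y):
    """U statistic for sample x: pairs with x_i > y_j, counting ties as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical subjects: mean steps/day and clinic 6MWD (m) at visit 1.
steps_per_day = {"s01": 3200, "s02": 4100, "s03": 4800, "s04": 2500, "s05": 4600,
                 "s06": 6200, "s07": 7400, "s08": 5300, "s09": 9100, "s10": 6800}
six_mwd_m = {"s01": 310, "s02": 355, "s03": 402, "s04": 280, "s05": 390,
             "s06": 420, "s07": 465, "s08": 398, "s09": 510, "s10": 450}

low, high = split_by_threshold(steps_per_day, six_mwd_m, threshold=5000)
if len(low) >= 5 and len(high) >= 5:  # comparison requires >= 5 per group
    print("U (low-step group):", mann_whitney_u(low, high))
```

Because many metric and parameter pairs are tested without multiple-comparison correction, individual significant results from such a loop should be read as exploratory.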
Clinic vs wearable comparison:
- FL6MWD vs clinic 6MWD: For each week, difference Δy between FL6MWD and a linear interpolation of clinic 6MWD between visits was computed. K-means clustering with silhouette method categorized subjects as “performers” (FL6MWD ~ clinic) or “underperformers” (FL6MWD below clinic). Temporal FL6MWD slope (m/week) was estimated via smoothed linear regression.
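A minimal sketch of the weekly comparison follows: linear interpolation of clinic 6MWD between the two visits, the difference Δy, exponential smoothing with α=0.3, and a least-squares slope. Function names, week indexing from 0, and initializing the smoother at the first value are assumptions; the k-means and silhouette step is not shown.

```python
def interpolate_clinic_6mwd(week, week_v1, d_v1, week_v2, d_v2):
    """Clinic 6MWD (m) linearly interpolated between visit 1 and visit 2."""
    frac = (week - week_v1) / (week_v2 - week_v1)
    return d_v1 + frac * (d_v2 - d_v1)


def weekly_deltas(fl6mwd_by_week, week_v1, d_v1, week_v2, d_v2):
    """Delta y per week: FL6MWD minus the interpolated clinic 6MWD."""
    return [fl - interpolate_clinic_6mwd(w, week_v1, d_v1, week_v2, d_v2)
            for w, fl in enumerate(fl6mwd_by_week)]


def exp_smooth(values, alpha=0.3):
    """Simple exponential smoothing; the first value seeds the series (assumed)."""
    smoothed = []
    for v in values:
        if not smoothed:
            smoothed.append(float(v))
        else:
            smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


def ls_slope(values):
    """Least-squares slope (units per week) against week index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two weeks")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Made-up subject: clinic 6MWD of 400 m at week 0 and 420 m at week 4.
fl = [300, 320, 310, 335, 340]
deltas = weekly_deltas(fl, week_v1=0, d_v1=400, week_v2=4, d_v2=420)
print(deltas, round(ls_slope(exp_smooth(fl)), 2))
```

A subject whose deltas stay well below zero, as in this made-up series, would fall on the "underperformer" side of the clustering.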
Cohort and data: 22 PAH patients (19 females, 3 males), mean age 50.6 ± 13.4 years; 405 total weeks; ~3.5×10^6 minutes. Weekly usage ranged 0.44–0.97.
Distributions and ranges:
- Step rate (SR) histograms (excluding SR=0) were log-normal with weekly means 13.6–32.8 SPM, SD 10.2–30.3 SPM, skewness 1.19–2.55. Average daily steps ranged 1338–10,679 (mean 5729).
- HR(SR=0) means 66.2–111.8 BPM (SD 6.4–13.7 BPM; skewness −0.75 to 2.30). Average of 10 lowest HR(SR=0) per week (proxy of true RHR) ranged 50.3–69.2 BPM.
- HR(SR>0) means 78.6–121.0 BPM (mean 94.4), SD 6.5–14.0 BPM; skewness −0.57 to 1.35.
Fitness (PWC analog): Fitness slopes ranged 0.02–0.31 BPM/SPM (mean 0.15); intercepts 73–120 BPM. Higher slopes were associated with access to a broader HR range at higher SR.
Ambulation metrics: Weekly ambulation frequency 1.6–96; endurance (1/e) 2.2–7.0 min; intensity (SR SD) 10–30 SPM. Ambulation product P ranged 42.8–10,845.3 (median ~1079; mean 1910).
Device usage patterns: Usage 0.44–0.97; maximum off-time ranged from <1 h to >12 h; common patterns included weekly overnight removal for charging and near-continuous wear with brief charging breaks.
Free-living 6MWD (FL6MWD): Subject-level weekly mean FL6MWD ranged 164–478 m (overall mean 344 m). Weekly extremes: 85.7–683.1 m. FL6MWD temporal slopes −17.9 to +10.2 m/week (mean +1.0 ± 5.4 m/week). Two groups relative to clinic 6MWD: performers (close to clinic estimates) and underperformers (below clinic), with underperformers showing daily activity below physical capacity.
PHS (FL6MWD/H6MWD): Week-1 PHS spanned ~0.25 to >1.0; 13/22 were between 0.7 and 0.8, and 5 were below 0.5. Over the trial, PHS increased in 11/22 and decreased in 11/22; the change was <10% for 15/22, and the normalized change was <1%/week for 18/22. Five subjects maintained PHS >0.72; four remained <0.52.
PCA and LPA: PCA revealed clusters: (i) high SR means/SD and high HR(SR>0) SD (indicative of broad activity and HR range), (ii) high HR(SR=0) (higher resting HR, limited HR range accessed), (iii) high inactivity fraction. LPA (3 groups by BIC): Group 1 had high ambulation metrics (steps/day, P, FL6MWD), high HR(SR>0), and high fitness slope; Group 2 had lowest ambulation metrics and lowest HR(SR=0) and HR(SR>0) but highest HR(SR=0) skewness; Group 3 had highest HR(SR=0) and HR(SR>0), lowest HR(SR=0) skewness and fitness slope.
Correlation with clinical parameters (Pearson r; V1 and V2 denote clinic visits 1 and 2):
- Albumin V1 correlated with HR(SR=0) (r=0.565) and HR(SR>0) (r=0.627).
- NT-proBNP V1 correlated with HR(SR=0) (r=0.585) and inversely with fitness slope (r=−0.585). RVSP V1 inversely correlated with fitness slope.
- RHR V1/V2 correlated with HR(SR=0), HR(SR=0) skewness, and HR(SR>0) (e.g., for RHR V2, r reached 0.782 with the HR metrics in the correlation matrix).
- 6MWD V1/V2 correlated with FL6MWD (e.g., r up to 0.632 at V2). Steps/day and ambulation P were not strongly correlated with continuous clinical variables.
Threshold-based subgroup differences (Mann–Whitney):
- Daily steps >5000 vs <5000 (14/22 vs 8/22): 6 significant clinical parameters; lower steps linked to lower 6MWD (visit 1), lower hemoglobin (visit 2), worse WHO FC (visit 1), more pedal edema (visit 2).
- Mean HR(SR=0) <82 vs ≥82 BPM (14/22 vs 8/22): 8 significant parameters; lower mean HR(SR=0) associated with lower clinic RHR (visits 1–2) and lower peak HR (visit 2), but more pedal edema and palpitations and worse EQ-5D usual activity and pain/discomfort at visit 1.
- HR(SR=0) skewness <1 vs ≥1 (11/22 vs 11/22): 4 significant parameters; lower skewness associated with higher clinic RHR (visits 1–2), less pain/discomfort, and better EQ-5D index at visit 1.
- Mean HR(SR>0) <95 vs ≥95 BPM (12/22 vs 10/22): 4 significant parameters; lower mean HR(SR>0) associated with lower clinic RHR (visits 1–2), lower albumin (visit 1), more palpitations (visit 1).
- Fitness slope >0.15 vs ≤0.15 (11/22 vs 11/22): 3 significant parameters; higher slope linked to lower NT-proBNP at visits 1–2 (means 188 ± 180 and 145 ± 165 pg/mL for high-slope group).
- Ambulation P >1000 vs ≤1000 (12/22 vs 10/22): 7 significant parameters; lower P associated with lower 6MWD (visits 1–2) and more pedal edema (visit 1).
- Usage ≥0.94 vs <0.94 (7/22 vs 15/22): 4 significant parameters; lower usage associated with worse WHO FC (visit 1), higher EQ VAS (more severe), and higher Borg dyspnea (visit 2).
- FL6MWD <320 m vs ≥320 m: 6 significant parameters; lower FL6MWD associated with lower clinic 6MWD (visits 1–2), more pedal edema (visit 2), worse WHO FC (visit 1), lower hemoglobin (visit 2). FL6MWD ≥400 m: associated with higher clinic 6MWD (visit 2), lower NT-proBNP (visit 2), less angina (visit 1), better WHO FC (visit 2).
Overall: Multiple wearable-derived metrics beyond daily steps captured diverse facets of physical and cardiovascular function, identified meaningful subgroups, and showed significant associations with established clinical measures and biomarkers.
The findings demonstrate that minute-level wearable data provide a multidimensional characterization of patients’ physical and cardiovascular status that extends far beyond daily step counts. Derived metrics (rest/active HR distributions, ambulation frequency/endurance/intensity, fitness slope/intercept, FL6MWD, usage, PHS) captured distinct and complementary aspects of health, as evidenced by limited intercorrelations among HR metrics and differing associations with clinical parameters. PCA and LPA revealed clinically interpretable clusters: high-activity/high-variability individuals, high resting HR/limited HR range individuals, and high inactivity individuals. The thresholding analyses linked wearable metrics to clinically relevant differences in physical performance (6MWD), symptoms (dyspnea, edema, palpitations, angina), physician-assessed functional class, and biomarkers (NT-proBNP, albumin, hemoglobin), suggesting potential surrogate markers of disease severity and functional capacity in PAH. The FL6MWD tracked clinic 6MWD and identified “underperformers,” whose free-living activity was below their clinical capacity, highlighting behavioral or environmental factors limiting daily activity. Usage metrics and patterns provided context for interpreting activity measures and may themselves signal changes in health state. Collectively, these results support integrating richer wearable-derived metrics into remote monitoring to inform risk stratification, therapeutic targeting, and follow-up intensity in PAH and potentially other chronic diseases.
This study shows that a broad set of clinically relevant metrics can be derived from consumer wearables beyond daily steps, including HR-derived rest/active distributions, ambulation characteristics, fitness slope/intercept, FL6MWD, and PHS. These metrics identified patient subgroups, correlated with key clinical measures (e.g., NT-proBNP, RVSP, RHR, 6MWD), and differentiated clinical status via threshold-based subgroup analyses. The approach is feasible in free-living conditions and can enhance telemedicine and longitudinal care by providing granular weekly signatures of physical and cardiovascular function. Future work should validate heart rate and step detection against gold standards, incorporate medication and other covariates, expand to larger, multi-center cohorts and diverse patient populations, and develop AI models to classify daily activity types and detect clinically meaningful changes in real time.
- Activity type validation: Although heart rate confirms device wear, specific activities of daily living were not independently validated; AI-based activity classification may improve interpretability.
- Measurement validation: Fitbit heart rate and step rate were not independently validated in this cohort; baseline laboratory validation (e.g., RHR, fixed-speed ambulation) and controlled studies would strengthen accuracy assessments.
- Covariate adjustment: Analyses did not account for medications or other covariates due to small sample size; incorporating these could refine associations. Thresholds were partly arbitrary or literature-guided, and multiple-testing adjustments were not applied.
- Generalizability and size: Small, single-center cohort (N=22) limits generalizability; replication in larger, multi-center and varied patient populations is needed.