Unsupervised learning of aging principles from longitudinal data

K. Avchaciov, M. P. Antoch, et al.

Konstantin Avchaciov, Marina P. Antoch, and their colleagues use machine learning on longitudinal data to derive a 'dynamic frailty indicator' (dFI). The study reports that dFI predicts remaining lifespan and responds to both life-shortening and life-extending interventions, making it a candidate marker of biological age.

Introduction
The study addresses how age-related physiological changes relate to lifespan and mortality without relying on age or survival labels. The authors hypothesize that aging reflects a dynamic instability of the organism’s state near a bifurcation (tipping point), implying that fluctuations of many physiological variables are governed by a single latent order parameter. Building on this, they aim to identify a data-driven descriptor of aging from longitudinal measurements. They develop a deep learning framework combining a denoising autoencoder for nonlinear dimensionality reduction and an autoregressive model to capture longitudinal dynamics, yielding a dynamic frailty indicator (dFI) from complete blood count (CBC) data in mice. The study tests whether dFI increases with age, predicts remaining lifespan, aligns with aging hallmarks, and responds to lifespan-modulating interventions, and whether its dynamics can explain deviations from the Gompertz law (late-life mortality deceleration).
Literature Review
Prior biomarker efforts include supervised models trained to predict chronological age or mortality (e.g., DNA methylation clocks) and frailty indices that correlate with disease burden and survival in humans and animals. Principal component analyses have been used to derive biological age measures, but such linear methods may be limited by nonlinearity and noise. Theoretical work on critical dynamics suggests that near instability, system behavior is governed by a few slow order parameters (enslaving principle), offering a mechanistic rationale for dimensionality reduction in aging. Classical demography (Gompertz law) describes exponential mortality acceleration; deviations at advanced ages (mortality deceleration/plateau) have been observed in various species and motivate mechanistic models. The study situates its contribution at the intersection of these strands by proposing an unsupervised, dynamical, theory-informed biomarker from longitudinal data that ties organismal state to survival dynamics.
Methodology
Data: The training set aggregated nine Mouse Phenome Database (MPD) sources, focusing on CBC assays to maximize overlap across datasets, yielding 6,693 animals and 12 features (GR%, GR, HB, HCT%, LY%, LY, MCHC, MCH, MCV, PLT, RBC, WBC). If granulocytes were missing, they were reconstructed as GR = WBC − LY − MO and GR% = 100 − LY% − MO%. Records with missing parameters (<2%) were excluded. Analyses focused on fully grown animals >25 weeks to separate development from aging.
PCA: Performed after adjusting strain effects by subtracting strain-specific youthful means. For animals >25 weeks, PC1 explained 31% of variance and correlated with age (r = 0.59, p < 1e-10); variance of PC scores increased with age, indicating stochastic dynamics.
Theory: Aging modeled as stochastic dynamics of a single order parameter z obeying a Langevin-type equation ż = a z + g z^2 + f, where a > 0 implies instability with exponential growth early (z ∝ exp(at)) and faster-than-exponential later as nonlinearity dominates. The model predicts late-life mortality deceleration with a limiting mortality plateau M(t→∞) ≈ α (the Gompertz exponent).
Deep learning (AE–AR): A denoising autoencoder (bottleneck 4-D latent y) compresses the 12-D CBC vector x and reconstructs it. A projector maps y to scalar dFI (z = A y). An AR(1) linear dynamics block models longitudinal evolution: z(t+Δt) = r z(t) + b + ε, trained on longitudinal pairs with Δt ≈ 26 weeks (g ≈ 0 linearized regime). An auxiliary decoder reconstructs x(t+Δt) from z(t+Δt) via φ^{-1}(B z). Constraints enforced A^T B ≈ I and B^T A ≈ I. Total loss combined AE reconstruction, future-state reconstruction, AR loss, constraint loss, and L2 regularization (weights α1 = 1, α2 ramp 0→1, α3 = 100, α4 = 0.01).
Training: 600 epochs, Adam, lr = 0.001 then 0.0001 for last 200; implemented in TensorFlow.
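To make the link between the Langevin-type equation and the AR(1) block concrete, the following is a minimal sketch rather than the authors' implementation. It simulates the linearized dynamics (g ≈ 0) with an Euler-Maruyama scheme and then fits z(t+Δt) = r z(t) + b by ordinary least squares on pairs taken 26 weeks apart. In this regime the fitted r should be close to exp(a Δt). The function names, noise level, step size, and initial distribution are all illustrative assumptions; the value a = 0.02 per week simply borrows the dFI growth exponent quoted in this summary.

```python
import numpy as np

def simulate_pairs(a=0.02, dt_pair=26.0, n=5000, noise=0.05, step=0.5, seed=0):
    """Simulate dz = a*z*dt + noise*dW (the g ≈ 0 regime) for n animals.

    Returns (z_now, z_next), the state at t and at t + dt_pair weeks.
    All parameter values are illustrative assumptions, not fitted values.
    """
    rng = np.random.default_rng(seed)
    z_now = rng.normal(1.0, 0.3, size=n)   # hypothetical initial dFI values
    z = z_now.copy()
    n_steps = int(round(dt_pair / step))
    for _ in range(n_steps):
        # Euler-Maruyama update: deterministic drift plus Gaussian noise
        z = z + a * z * step + noise * np.sqrt(step) * rng.normal(size=n)
    return z_now, z

def fit_ar1(z_now, z_next):
    """Least-squares fit of z_next = r * z_now + b; returns (r, b)."""
    r, b = np.polyfit(z_now, z_next, deg=1)
    return float(r), float(b)

z_now, z_next = simulate_pairs()
r_hat, b_hat = fit_ar1(z_now, z_next)
print(f"fitted r = {r_hat:.3f}, expected exp(a*dt) = {np.exp(0.02 * 26.0):.3f}")
```

An r above 1 in such a fit corresponds to a > 0, that is, to the unstable regime the theory describes. A real analysis would fit r jointly with the autoencoder rather than on a precomputed scalar.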
Validation datasets (excluded from training): MA0071 (cross-sectional NIH Swiss males and females across ages), MA0072 (longitudinal males at 66–130 weeks), MA0073 (endpoint euthanasia cohorts).
Mortality analysis: Linked the CBC longitudinal dataset (Peters4) to mortality records (Yuan223); handled censoring by excluding animals with only one CBC and censoring those with ≥2 CBCs at their last measurement; computed Spearman rank correlations between dFI and lifespan within age/sex cohorts; for comparison, built a supervised Cox PH model (HR_CBC) with age, sex, and CBC features as covariates using lifelines; also compared with serum IGF1 and body weight where available.
Interventions and hallmarks: Assessed associations of dFI with the physiological frailty index (PFI), red cell distribution width (RDW), body weight (BW), CRP, chemokine CXCL1 (KC), and senescent cell burden using p16Ink4a-luciferase reporter mice (bioluminescence). Evaluated responses to high-fat diet (HFD) vs regular diet at week 78 (sex-specific) and to 8-week rapamycin treatment (12 mg/kg/d oral gavage) in 60-week-old C57BL/6 males vs vehicle controls, using longitudinal dFI increments ΔdFI before/during treatment to infer the drug effect per the AR equation with a force term.
Model evaluation: AE reconstruction (test RMSE ≈ 229.6; R² ≈ 0.55 average; best for HCT, RBC, LY; poorer for MCHC, PLT).
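The within-cohort Spearman analysis described above can be sketched as follows. This is an illustrative NumPy-only version, not the authors' code: the record layout `(cohort_label, dfi, lifespan)`, the function names, and the minimum cohort size of three are assumptions, and the toy numbers are invented purely to exercise the functions.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    x = np.asarray(values, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

def spearman_by_cohort(records, min_size=3):
    """records: iterable of (cohort_label, dfi, lifespan) tuples (hypothetical layout)."""
    groups = {}
    for cohort, dfi, lifespan in records:
        groups.setdefault(cohort, []).append((dfi, lifespan))
    return {
        cohort: spearman(*zip(*pairs))
        for cohort, pairs in groups.items()
        if len(pairs) >= min_size
    }

# Toy data only: higher dFI paired with shorter lifespan inside each cohort.
toy = [
    ("M/52wk", 0.1, 130), ("M/52wk", 0.3, 118), ("M/52wk", 0.6, 101),
    ("F/52wk", 0.2, 140), ("F/52wk", 0.4, 125), ("F/52wk", 0.5, 110),
]
print(spearman_by_cohort(toy))
```

Computing the correlation separately per age/sex cohort keeps age and sex from driving the association, which is the point of the stratification in the text.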
Key Findings
- PCA of CBC in fully grown mice (>25 weeks): PC1 explained 31% variance; PC1 correlated with age (r = 0.59, p < 1e-10); variance of PCs increased with age, indicating stochasticity.
- The derived dFI (order-parameter proxy) showed strong longitudinal auto-correlation: age-adjusted dFI correlations over 14 and 28 weeks were r = 0.71 (p = 3e-07; n = 40) and r = 0.70 (p = 9e-04; n = 19) in MA0072, exceeding PC1’s autocorrelation.
- dFI increased approximately exponentially with age up to around the average lifespan (~100 weeks), with growth exponent α ≈ 0.02 per week; dFI saturated near a limiting value in the oldest animals, consistent with nonlinearity-driven disintegration.
- dFI predicted remaining lifespan: significant Spearman correlations between age/sex-adjusted dFI and lifespan across multiple cohorts and ages; in comparisons, dFI performed on par with or marginally better than a supervised Cox PH model (HR_CBC). IGF1 and BW were predictive mainly in younger cohorts (e.g., IGF1 in 26-week males r = −0.28, p = 0.008), but not later in life.
- dFI correlated with the Physiological Frailty Index (PFI): r = 0.64 (p < 0.001), remaining significant after sex/age adjustment (r = 0.59, p < 0.001).
- dFI associated with multiple hallmarks and biomarkers: strong associations with RDW and BW; with CRP (r = 0.39, p < 0.001) and CXCL1/KC (r = 0.28, p < 0.001). Semi-quantitative clustering highlighted associations with immune (white cells), metabolic/oxygen transport (red cells), and platelets subsystems.
- dFI tracked senescent cell burden: p16Ink4a-luciferase total flux correlated with age (r = 0.54, p = 0.008; n = 23) and even more strongly with dFI (r = 0.69, p = 0.0003; n = 23).
- Interventions: High-fat diet increased dFI in males at week 78 vs regular diet (p = 0.05; n = 7 HFD vs n = 8 RD), with no effect in females (n = 8 HFD vs n = 12 RD), mirroring lifespan effects. Rapamycin (8 weeks, 12 mg/kg/day) reduced dFI increments during treatment vs periods without treatment (p = 0.05) and attenuated body weight gain compared to controls.
- Mortality deceleration: In very large external mouse cohorts, survival curves deviated from Gompertz at late life with mortality leveling towards a plateau consistent with the theoretical limiting mortality M ≈ α. Gompertz parameters estimated: males t ≈ 100.3 weeks, α = 0.0385/week, M0 ≈ 4.1×10^-3/week; females t ≈ 115.8 weeks, α = 0.0568/week, M0 ≈ 3.4×10^-3/week.
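As a quick worked reading of the Gompertz exponents above, the sketch below converts α into two more intuitive numbers. It relies only on standard survival arithmetic, not on anything specific to the paper: under Gompertz mortality M(t) ∝ exp(αt), the mortality rate doubles every ln 2 / α weeks, and if mortality does level off at a constant hazard M ≈ α, the median remaining survival on that plateau is also ln 2 / α. The function names are hypothetical.

```python
import math

def doubling_time_weeks(alpha_per_week):
    """Mortality-rate doubling time under Gompertz growth exp(alpha * t)."""
    return math.log(2.0) / alpha_per_week

def plateau_median_survival_weeks(hazard_per_week):
    """Median remaining survival under a constant hazard (the plateau case)."""
    return math.log(2.0) / hazard_per_week

def weekly_survival_on_plateau(hazard_per_week):
    """Probability of surviving one more week at a constant hazard."""
    return math.exp(-hazard_per_week)

for label, alpha in (("males", 0.0385), ("females", 0.0568)):
    # With M ≈ alpha on the plateau, both quantities reduce to ln(2) / alpha:
    # roughly 18.0 weeks for males and 12.2 weeks for females.
    print(label,
          round(doubling_time_weeks(alpha), 1),
          round(plateau_median_survival_weeks(alpha), 1),
          round(weekly_survival_on_plateau(alpha), 3))
```

The coincidence of the two timescales is a direct consequence of the plateau value M ≈ α and would not hold for a plateau at some other level.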
Discussion
Findings support the hypothesis that aging dynamics in mice can be captured by a single latent order parameter whose stochastic evolution drives physiological change and mortality. The unsupervised AE–AR framework identified dFI from routine CBC data, which displayed key properties of a biological age marker: exponential increase with age, strong longitudinal persistence, prediction of remaining lifespan, alignment with frailty (PFI) and inflammation (CRP, KC), and responsiveness to lifespan-modulating interventions (HFD, rapamycin). The observed saturation of dFI near end-of-life and the late-life mortality deceleration/plateau agree with the criticality-based stochastic model, linking dFI dynamics to survival patterns and explaining deviations from Gompertz law at advanced ages. The strong correlations between dFI and hematopoietic parameters suggest a central role for the hematopoietic/myeloid system in organismal aging, consonant with epigenetically driven myeloid bias in aged HSCs. The approach yields a model-based, interpretable biomarker with an explicit equation of motion, enabling sensitive longitudinal detection of anti-aging drug effects over short timescales and offering a path to generalize to other longitudinal biomedical signals.
Conclusion
The study introduces a theory-informed, unsupervised deep learning framework (AE–AR) that extracts a dynamic frailty indicator (dFI) from longitudinal CBC data. dFI operates as an empirical order parameter of aging: it increases exponentially, predicts remaining lifespan, correlates with frailty and inflammatory markers, tracks senescent cell burden, and responds to life-shortening and life-extending interventions. The model quantitatively links organismal state dynamics to survival, accounting for late-life mortality deceleration. This provides a practical tool for aging biology and pharmacology, enabling short, longitudinal trials to detect intervention effects. Future work should incorporate richer biomarker panels, increase model capacity to capture nonlinear dynamics (e.g., higher-rank AR and mode coupling), validate across strains and species, and translate the approach to human longitudinal datasets to discover actionable aging phenotypes and evaluate therapies.
Limitations
- Mortality-linked sample sizes within MPD-linked cohorts were limited, constraining detailed analysis of late-life deviations; large-cohort mortality validation used external datasets.
- The AR component used a linearized (g ≈ 0) approximation; while justified for much of the lifespan, nonlinearity becomes important near the end of life and may require higher-rank/nonlinear models.
- CBC-only training excluded other biomarkers due to sample size constraints, potentially limiting generality; AE reconstruction was poor for some features (e.g., MCHC, platelets) in the test set.
- Strain and context variability affected feature–dFI associations (including sign differences), challenging linear models and underscoring nonlinearity; generalization across strains/environments/interventions needs further validation.
- Cross-sectional vs longitudinal data imbalance necessitated architectural regularization; measurement noise and missingness imputation constraints may impact inference.