Introduction
The field of aging research has seen significant advancements with the development of "age clocks," machine learning models that predict age from biological data such as DNA methylation, gene expression, proteomics, and metabolomics. While DNA methylation-based clocks achieve high accuracy, gene expression data offers a closer link to the aging phenotype, potentially enhancing interpretability. However, existing transcriptomic age clocks often lack interpretability, functioning as "black box" models. This research proposes a new age clock that addresses this limitation by integrating prior biological knowledge into the model's design. The approach uses a knowledge-primed artificial neural network, where gene-pathway annotations structure the network's architecture, guiding information flow and allowing pathway-based representations of molecular data to be learned. This design enables monitoring of pathway aging states, enhancing interpretability. The study uses a large transcriptomic dataset from epidermal skin samples (n=887) from the SHIP-TREND cohort, a longitudinal study well-suited for investigating natural aging progression. Skin is an ideal tissue for studying aging due to its easily observable aging phenotype and accessibility for non-invasive sampling. The model's interpretability is expected to unlock its utility in aging research and applied settings, such as cell-culture-based screening.
Literature Review
The introduction extensively reviews existing age clocks based on various biological data types, highlighting their strengths and weaknesses. It emphasizes the lack of interpretability in many existing models and cites previous work on knowledge-primed neural networks for modeling biological data, particularly the use of such networks for modeling yeast growth from transcriptomic data. The paper also references studies on the role of specific genes and pathways in aging and lifespan, setting the stage for the validation experiments conducted later in the study.
Methodology
The study employs a pathway-based artificial neural network, built upon the "Hallmark" pathway collection, consisting of 50 refined gene sets representing essential biological processes. The network architecture features an input layer for gene expression data, four hidden pathway layers (with connections restricted according to pathway affiliations), and two output layers: a main output for age prediction and an auxiliary output for pathway aging state monitoring. To improve model robustness and accuracy, an ensemble learning approach was implemented, combining the outputs of 10 individually trained networks. Model training used RNA sequencing data from 887 epidermal skin samples, split into training (70%) and testing (30%) sets. The networks were trained for 200 epochs using a loss function that combined the mean squared errors of both the main and auxiliary outputs. The model's performance was assessed using the median absolute error on the test set. In silico gene knockdowns and overexpressions were simulated to validate the model's ability to recapitulate known aging mechanisms. The effects of complex transcriptional signatures associated with various aging-related conditions (Hutchinson-Gilford progeria syndrome, photoaging, actinic keratosis, cutaneous squamous cell carcinoma, replicative senescence, and caloric restriction) were simulated to assess the model's ability to predict their impact on biological aging and associated pathways. Visual age estimations from standardized portrait images were obtained from an expert panel and correlated with the transcriptomic age predictions. A fully connected "black box" neural network was also trained for comparison.
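The architecture described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single pathway layer instead of four, a toy gene-to-pathway mask instead of the 50 Hallmark gene sets, and an auxiliary output in which each pathway neuron yields its own age estimate. All names (`MASK`, `init_params`, `forward`, `combined_loss`, `ensemble_predict`, `aux_weight`) are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

# Toy sizes (assumption); the study uses 50 Hallmark pathways and 10 networks.
N_GENES, N_PATHWAYS, N_MODELS = 6, 3, 10

# Invented gene-to-pathway membership mask (rows: genes, columns: pathways).
MASK = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 1],
], dtype=float)


def init_params(rng):
    """Randomly initialise one small pathway-restricted network."""
    return {
        "W_gene": rng.normal(scale=0.1, size=(N_GENES, N_PATHWAYS)),
        "b_path": np.zeros(N_PATHWAYS),
        "aux_scale": np.ones(N_PATHWAYS),
        "aux_shift": np.zeros(N_PATHWAYS),
        "w_age": rng.normal(scale=0.1, size=N_PATHWAYS),
        "b_age": 0.0,
    }


def forward(params, x):
    """Return (main age output, per-pathway auxiliary ages, pathway activations)."""
    # Multiplying by MASK removes every gene-to-pathway connection that the
    # annotation does not support, so each pathway neuron only sees its genes.
    pathway = np.tanh(x @ (params["W_gene"] * MASK) + params["b_path"])
    aux_age = pathway * params["aux_scale"] + params["aux_shift"]
    main_age = pathway @ params["w_age"] + params["b_age"]
    return main_age, aux_age, pathway


def combined_loss(params, x, age_true, aux_weight=1.0):
    """Sum of the mean squared errors of the main and auxiliary outputs."""
    main_age, aux_age, _ = forward(params, x)
    main_mse = np.mean((main_age - age_true) ** 2)
    aux_mse = np.mean((aux_age - age_true[:, None]) ** 2)
    return main_mse + aux_weight * aux_mse


def ensemble_predict(models, x):
    """Average the main age output over all networks in the ensemble."""
    return np.mean([forward(p, x)[0] for p in models], axis=0)


def median_absolute_error(age_true, age_pred):
    """Evaluation metric named in the text."""
    return float(np.median(np.abs(age_true - age_pred)))


rng = np.random.default_rng(0)
models = [init_params(rng) for _ in range(N_MODELS)]
x = rng.normal(size=(4, N_GENES))             # synthetic expression values
age_true = np.array([25.0, 40.0, 55.0, 70.0])  # synthetic ages

ensemble_age = ensemble_predict(models, x)
loss = combined_loss(models[0], x, age_true)
```

Because the mask is applied inside `forward`, a gradient-based optimiser would never update the masked-out connections, which is what keeps each hidden neuron interpretable as one pathway. How the original networks weight the two loss terms is not stated here, so `aux_weight` is only a placeholder.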
Key Findings
The pathway-based age clock achieved a median absolute error of 4.7 years on the independent test set, comparable to existing "black box" transcriptomic clocks. Transcriptomic age estimates showed a significant association with visual age estimates. Analysis of pathway neuron activations revealed that the p53 and TNFα/NFκB signaling pathways were most strongly associated with age, although most pathways showed a significant age association, supporting the deleteriome hypothesis of aging. In silico gene knockdowns successfully recapitulated known associations between gene perturbations and lifespan from the literature (e.g., SIRT1 knockdown increased predicted age, while TXNIP knockdown decreased it). Systematic knockdown simulations identified several known and novel aging target genes, including HK2 (hexokinase 2). Analysis of pathway neuron activations also revealed that genes influencing multiple pathways exerted a greater influence on age estimation. The model accurately captured the effects of complex transcriptional signatures associated with various conditions. For example, the Hutchinson-Gilford progeria syndrome (HGPS) signature caused a dramatic increase in predicted age, while caloric restriction signatures (particularly in liver and fat tissue) showed a rejuvenating effect. The model also identified pathway perturbations associated with these conditions, providing mechanistic insights into their effects. It revealed a strong association between the epithelial-mesenchymal transition pathway and HGPS, suggesting a new avenue for research into the disease. Caloric restriction signatures showed a rejuvenation of pathways associated with ROS, peroxisomes, mTOR signaling, and metabolism. Photoaging simulations affected ROS, Wnt signaling, and metabolic pathways. Analysis of the actinic keratosis (AK) and cutaneous squamous cell carcinoma (SCC) signatures demonstrated a significant correlation between their pathway patterns, with the differences emphasizing immune signaling and JAK-STAT pathways in SCC.
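The in silico perturbation idea can be sketched as follows: scale one gene's expression, rerun the age model, and record the shift in predicted age. This is a simplified illustration, not the study's procedure. It assumes a stand-in linear age model and a multiplicative perturbation, and all names (`predict_age`, `perturb_gene`, `knockdown_effects`, `factor`) and all numbers are invented for the example.

```python
import numpy as np


def predict_age(expression, weights, intercept):
    """Stand-in age model (assumption): a linear readout of expression values."""
    return expression @ weights + intercept


def perturb_gene(expression, gene_index, factor):
    """Scale one gene in every sample (factor < 1: knockdown, > 1: overexpression)."""
    perturbed = expression.copy()
    perturbed[:, gene_index] *= factor
    return perturbed


def knockdown_effects(expression, weights, intercept, factor=0.0):
    """Mean change in predicted age when each gene is perturbed in turn."""
    baseline = predict_age(expression, weights, intercept)
    effects = {}
    for gene_index in range(expression.shape[1]):
        perturbed = perturb_gene(expression, gene_index, factor)
        shifted = predict_age(perturbed, weights, intercept)
        effects[gene_index] = float(np.mean(shifted - baseline))
    return effects


# Toy data: two samples, three genes; values are illustrative only.
expression = np.array([[1.0, 2.0, 1.0],
                       [3.0, 1.0, 2.0]])
weights = np.array([-2.0, 3.0, 0.5])
intercept = 40.0

effects = knockdown_effects(expression, weights, intercept)
# A positive value means the simulated knockdown raises predicted age.
ranked = sorted(effects, key=lambda g: abs(effects[g]), reverse=True)
```

In this toy setup, silencing gene 0 raises the predicted age and silencing gene 1 lowers it. Ranking genes by the absolute shift is one plausible way to run a systematic screen. With the pathway network in place of the linear stand-in, the same loop could also record the change in each pathway neuron.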
Discussion
The results demonstrate that the knowledge-primed neural network models transcriptomic age with accuracy comparable to "black box" models while providing insight into the underlying biological mechanisms. The model's ability to recapitulate known gene-aging relationships and predict the effects of complex transcriptional signatures supports its validity and utility. The identification of key pathways involved in accelerated aging conditions and pro-longevity interventions provides a starting point for future research. The age-dependent effects of caloric restriction, especially in different tissues, highlight the complex interplay of factors involved in aging. The findings regarding the epithelial-mesenchymal transition pathway in Hutchinson-Gilford progeria syndrome and the role of immune and JAK-STAT signaling in the progression from actinic keratosis to cutaneous squamous cell carcinoma warrant further investigation. The overall success of the model underscores the potential of interpretable machine learning approaches to advance our understanding of aging.
Conclusion
This study successfully developed a novel interpretable age clock using a pathway-based artificial neural network. The model's accuracy is comparable to existing transcriptomic clocks, but its unique ability to reveal pathway-level insights provides significant advantages. This approach offers a powerful tool for aging research, facilitating the identification of key pathways involved in aging and allowing for the exploration of potential interventions. Future research could extend this model to other tissues and data types, further enhancing its utility and providing a more comprehensive understanding of aging at a molecular level. Investigating the identified novel candidate genes is another promising area for future research.
Limitations
While the model demonstrates strong performance, some limitations exist. The study focused on epidermal skin tissue, which limits generalizability to other tissues. The use of the Hallmark pathway collection might introduce bias through the selection of pathways included. The in silico gene knockdowns are simulations and may not fully reflect the complexity of in vivo gene perturbations. Furthermore, although the model's accuracy is comparable to that of other transcriptomic age clocks, it is slightly less accurate than the "black box" model trained on the same data. This highlights a trade-off between interpretability and predictive accuracy that may need to be considered in practice.