Learning interpretable dynamics of stochastic complex systems from experimental data

T. Gao, B. Barzel, et al.

The Langevin Graph Network Approach (LaGNA), developed by Ting-Ting Gao, Baruch Barzel, and Gang Yan, infers stochastic differential equations from empirical data on complex networks. The method outperforms existing techniques in the reported comparisons and yields insights into bird flock dynamics and tau pathology in mice.

Introduction

Complex systems, from cell migration to bird flocking, exhibit both nonlinearity and stochasticity. Stochasticity enhances adaptability, information processing, and robustness. Although the behavior of these systems is observable, their underlying dynamics are often elusive. Stochastic differential equations (SDEs) offer a powerful framework for modeling such systems, but conventional approaches rely on predefined functional forms and assumed parameters. The increasing availability of empirical data (network topologies and node activities) provides an opportunity to infer hidden SDEs directly from observations. This work belongs to a field that leverages artificial intelligence for scientific exploration, where progress has been made in identifying ordinary and partial differential equations (ODEs and PDEs) for simpler systems. However, methods that address stochasticity in large networks remain limited and focus primarily on prediction rather than on inferring the underlying SDEs. Most are validated only on simulated systems and have not been demonstrated on real stochastic systems whose dynamics are unknown. This research addresses the challenge of inferring coupled SDEs from network topology and node activity time series, aiming to uncover the hidden stochastic dynamics of complex systems.

Literature Review

Existing literature demonstrates progress in data-driven discovery of governing equations. Several methods successfully identify ordinary differential equations (ODEs) and partial differential equations (PDEs) for single- and few-body nonlinear systems. Other approaches focus on ODEs for large networks. However, these methods often fail to adequately address the inherent stochasticity found in many real-world complex systems. Previous attempts at learning stochastic dynamics have primarily focused on predicting future system behavior, neglecting the crucial task of inferring the underlying stochastic differential equations (SDEs). Furthermore, validation of these methods often relies on simulated systems with known ground-truth dynamics, limiting their applicability to real-world scenarios with unknown underlying mechanisms. This paper addresses these limitations by proposing a novel method capable of inferring the underlying SDEs of complex systems from real-world data, showcasing its effectiveness through applications to natural systems.

Methodology

The authors propose the Langevin Graph Network Approach (LaGNA). LaGNA uses a message-passing mechanism built from three neural network modules: a self-dynamics simulator $f(\cdot)$, an interaction dynamics simulator $g(\cdot)$, and a diffusion simulator $\Phi(\cdot)$. These modules separate the dynamical sources within nodal activity data, and message passing is guided by the network's topology through the adjacency matrix $A_{ij}$. The system's state evolution is described by the equation:

$\mathbf{x}_i(t + dt) = \mathbf{x}_i(t) + \left(f_i(\mathbf{x}(t)) + g_i(\mathbf{x}(t), A_{ij}\mathbf{x}(t))\right) dt + \Phi(\mathbf{x}(t))\, d\mathbf{W}_i$

where $\mathbf{x}_i(t)$ represents the $d$-dimensional state of node $i$ at time $t$, and $\mathbf{W}_i$ is a $d$-dimensional Wiener process. Training maximizes the expectation of the log-likelihood of the next-step state given the current state and time step (Equation 2).

In the second stage, the trained modules $f(\cdot)$, $g(\cdot)$, and $\Phi(\cdot)$ are probed to derive an explicit mathematical expression for each part, using pre-constructed libraries of candidate terms and a modified two-phase approach. This identifies concise expressions for the self-dynamics, interaction dynamics, and stochastic diffusion, which together form the final SDE. The method handles signed and weighted networks by incorporating knowledge of link types and weights.

Comparisons with five state-of-the-art methods (Modified-SINDy, Two-Phase inference, SDE-net, SVISE, and SFI) on a stochastic Lorenz networked system demonstrate LaGNA's superior performance in inferring networked SDEs.

For bird flocking analysis, LaGNA is extended to a second-order version that incorporates modules for self-propulsion, cohesion, and alignment, with a modified loss function (Equation 3) balancing different error terms. For tau pathology analysis, a specific SDE is inferred (Equation 5) that accounts for neuroanatomical and spatial influences, incorporating retrograde, anterograde, and spatial diffusion terms as well as heterogeneity factors.
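The one-step update and the likelihood objective described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes scalar node states ($d = 1$), a pairwise interaction term of the form $\sum_j A_{ij}\, g(x_i, x_j)$ (the summary does not spell out the exact form), and hand-written stand-in functions `f`, `g`, and `phi` in place of trained neural network modules. All function names are hypothetical.

```python
import numpy as np

def drift(x, A, f, g):
    """Drift per node: f(x_i) + sum_j A_ij * g(x_i, x_j) (assumed pairwise form)."""
    xi = x[:, None]  # shape (n, 1): state of the receiving node i
    xj = x[None, :]  # shape (1, n): state of the sending node j
    return f(x) + (A * g(xi, xj)).sum(axis=1)

def euler_maruyama_step(x, A, f, g, phi, dt, rng):
    """Advance x(t) to x(t + dt) with an independent Wiener increment per node."""
    dW = rng.normal(0.0, np.sqrt(dt), size=x.shape)
    return x + drift(x, A, f, g) * dt + phi(x) * dW

def transition_log_likelihood(x_now, x_next, A, f, g, phi, dt):
    """Log-density of x_next given x_now; the one-step transition is Gaussian."""
    mean = x_now + drift(x_now, A, f, g) * dt
    var = phi(x_now) ** 2 * dt
    return np.sum(-0.5 * np.log(2 * np.pi * var) - (x_next - mean) ** 2 / (2 * var))

# Toy stand-ins for the three modules (illustrative choices, not from the paper).
def f(x):
    return -x                # linear self-decay

def g(xi, xj):
    return xj - xi           # diffusive coupling

def phi(x):
    return 0.2 + 0.0 * x     # constant noise amplitude

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # two mutually connected nodes
rng = np.random.default_rng(0)
x0 = np.array([1.0, 3.0])
dt = 0.01

x1 = euler_maruyama_step(x0, A, f, g, phi, dt, rng)
ll = transition_log_likelihood(x0, x1, A, f, g, phi, dt)
```

In a training loop, `f`, `g`, and `phi` would be parameterized models, and the parameters would be adjusted to increase the average of `transition_log_likelihood` over observed consecutive state pairs.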

Key Findings

LaGNA demonstrates superior performance compared to five state-of-the-art methods in inferring the stochastic differential equations governing complex networked systems.

Applying LaGNA to bird flocking data resulted in an inferred SDE remarkably similar to the second-order Vicsek model, providing strong evidence that this model accurately captures real flocking dynamics. The equation inferred from the bird flock data is:

$\dot{v} = s_1(v^2 + s_2^2) + s_3 + \sum_{j}(\hat{C}(r_{ij}) + \hat{A}(r_{ij})v_j) + \epsilon dW_t$

The application of LaGNA to tau pathology diffusion in mouse brains yielded a novel SDE (Equation 5) that accurately predicts tau spread, considering both neuroanatomical connections and spatial proximity. The inferred equation highlights the importance of spatial diffusion, a factor often overlooked in previous studies. This model successfully predicts tau occupation in each brain region at various time points post-injection, exhibiting high accuracy (Fig 4c,d). The model also reveals distinct pathology dynamics in mutant mice, with a pronounced inclination towards retrograde diffusion. The inferred equation for tau pathology diffusion demonstrates good generalizability across multiple datasets and different experimental conditions.

Discussion

The findings demonstrate the power of LaGNA in learning interpretable stochastic dynamics from observational data. The close resemblance of the inferred bird flocking SDE to the Vicsek model offers valuable insights into collective animal behavior. The ability to accurately predict tau pathology spread in mouse brains using the inferred SDE opens new avenues for early diagnosis and treatment strategies for Alzheimer's disease. The discovery of distinct tau pathology dynamics in mutant mice provides crucial insights into disease mechanisms. These results highlight the potential of LaGNA for advancing our understanding of diverse complex systems and facilitating downstream applications like control.

Conclusion

LaGNA provides a novel and effective method for inferring interpretable stochastic differential equations from observational data of complex networked systems. Its superior performance compared to existing methods is demonstrated through applications to both bird flocking and tau pathology diffusion. Future research could focus on addressing limitations such as data incompleteness, noise separation, simultaneous inference of topology and dynamics, and incorporating higher-order interactions.

Limitations

LaGNA has several limitations. In scenarios with missing node activity data, determining the minimal sub-network required for accurate inference is crucial. Distinguishing between intrinsic and extrinsic noise (e.g., measurement errors) is challenging; while the Kalman-Takens filter can help, further refinement is needed. Obtaining the complete network topology is not always feasible, and simultaneous inference of topology and dynamics remains a challenge. The use of pre-constructed libraries might overlook certain terms; improvements in symbolic regression are needed to enhance the automation of term identification. Lastly, extending LaGNA to accommodate higher-order interactions will increase the complexity of the model and requires further investigation.
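To make the library limitation concrete, the sketch below shows a generic library-based sparse regression using sequentially thresholded least squares, the selection scheme popularized by SINDy. It is a stand-in for illustration only and is not the modified two-phase approach used in LaGNA. The candidate library, the threshold, and the synthetic target `y` (standing in for the outputs of a trained drift module) are all assumptions.

```python
import numpy as np

def build_library(x):
    """Evaluate a small polynomial library of candidate terms on samples x."""
    names = ["1", "x", "x^2", "x^3"]
    theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
    return names, theta

def stlsq(theta, y, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    coef = np.linalg.lstsq(theta, y, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        keep = ~small
        if not keep.any():
            break
        # Refit using only the terms that survived thresholding.
        coef[keep] = np.linalg.lstsq(theta[:, keep], y, rcond=None)[0]
    return coef

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=500)
# Synthetic target: x - x^3 plus small noise (illustrative, not from the paper).
y = x - x**3 + 0.05 * rng.normal(size=x.size)

names, theta = build_library(x)
coef = stlsq(theta, y)
selected = {n: float(c) for n, c in zip(names, coef) if c != 0.0}
```

Here the true terms `x` and `x^3` are in the library, so the regression can select them. If the target contained a term absent from the library, such as a trigonometric function, the regression could only approximate it with the available candidates, which is the failure mode the limitation above describes.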
