Computer Science
Data-driven modeling of interrelated dynamical systems
Y. Elul, E. Rozenberg, et al.
Dynamical systems appear across natural and engineered domains (atmospheric convection, physiological processes, financial markets). With increased computational power and data availability, machine learning—especially deep learning—has become central for modeling non-linear dynamics directly from data. Koopman spectral theory provides a powerful lens by representing non-linear dynamics via a linear operator (the Koopman operator, KO) acting on lifted observables, enabling prediction, analysis, and control using linear-operator tools. Prior machine learning approaches largely address single-system scenarios, missing opportunities to exploit shared information when multiple related systems exist. In many real-world settings (e.g., patient-specific medicine), individual datasets are limited, necessitating principled sharing across similar systems. Chaotic systems (e.g., Lorenz) often permit only local finite-dimensional linearizations; still, various neighborhoods may share structure enabling improved global approximations when learned jointly from multiple trajectories. This work studies multi-system (or multi-neighborhood) settings where systems have similar or identical dynamics, aiming to exploit common structure while retaining system-specific behavior. The authors formalize linearly interrelated dynamics: there exists a shared coordinate transformation that globally linearizes multiple systems, each having distinct linear dynamics in the shared representation. They propose to learn a single shared encoder-decoder (observable space) across systems, and per-system linear propagation operators, enabling the use of shared information while preserving system-specific dynamics. 
They argue single-system models fail to learn effectively across multiple systems, and motivate a new framework (MIDST) that jointly approximates Koopman operators with shared components, improving forecasting accuracy, statistical fidelity, and data efficiency, including faster adaptation to new systems with far less data.
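To make the Koopman idea concrete, the following minimal sketch uses a standard textbook-style toy map (not taken from the paper) whose non-linear dynamics become exactly linear once the state is lifted to the observables (x1, x2, x1^2). The parameters `LAM` and `MU`, the function names, and the choice of observables are all illustrative assumptions.

```python
import numpy as np

# Hypothetical parameters of a toy map; not from the paper.
LAM, MU = 0.9, 0.5

def step(x):
    """One step of the non-linear map in the original state space."""
    x1, x2 = x
    return np.array([LAM * x1, MU * x2 + (LAM**2 - MU) * x1**2])

def lift(x):
    """Observables g(x) = (x1, x2, x1^2), which are closed under the map."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])

# Finite-dimensional Koopman matrix acting on the lifted coordinates.
K = np.array([
    [LAM, 0.0, 0.0],
    [0.0, MU, LAM**2 - MU],
    [0.0, 0.0, LAM**2],
])

def rollout_nonlinear(x0, steps):
    """Iterate the non-linear map directly."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = step(x)
    return x

def rollout_koopman(x0, steps):
    """Advance the lifted state linearly, then read the state back off."""
    z = np.linalg.matrix_power(K, steps) @ lift(np.asarray(x0, dtype=float))
    return z[:2]  # the first two observables are the state itself
```

For this toy map the two rollouts agree up to floating-point error. For chaotic systems such as Lorenz no such exact finite-dimensional lifting is expected to exist globally, which is why the paper works with learned approximations.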
The paper builds on Koopman operator theory for representing non-linear dynamics via linear action on observables, referencing foundational and applied works on Koopman spectral methods and data-driven approximations (e.g., DMD/EDMD, mpEDMD). Prior deep learning approaches include autoencoder-based universal linear embeddings (ULE) and consistent Koopman autoencoders (CK). Time-delay embeddings and Hankel-based methods have been used to approximate KOs. However, existing methods predominantly target single-system settings or fuse multiple measurement modalities of a single system rather than jointly learning multiple different but related systems. The authors note that naive multi-system training (fully shared model) fails to accommodate system variability, while separate per-system models fail to leverage shared structure. They contrast their approach with previous works on joint linear dynamics for multi-modal observations of the same system and position MIDST as a method for learning shared embeddings with per-system linear dynamics, explicitly addressing multi-system interrelated dynamics.
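As a rough illustration of the data-driven approximations mentioned above, the sketch below fits a Koopman matrix by least squares on a fixed dictionary of observables, in the spirit of EDMD. It uses a column-vector convention, dictionary(Y) ≈ K · dictionary(X), and a hypothetical toy map and dictionary; it is not the baseline configuration used in the paper.

```python
import numpy as np

def edmd(X, Y, dictionary):
    """Least-squares Koopman matrix K such that dictionary(Y) ~ K @ dictionary(X).

    X, Y: arrays of shape (n_samples, n_state) holding snapshot pairs,
    where Y[i] is the successor state of X[i].
    """
    PX = np.stack([dictionary(x) for x in X], axis=1)  # (n_obs, n_samples)
    PY = np.stack([dictionary(y) for y in Y], axis=1)
    return PY @ np.linalg.pinv(PX)

# Hypothetical toy map and dictionary, chosen so that an exact K exists.
def toy_step(x, lam=0.9, mu=0.5):
    return np.array([lam * x[0], mu * x[1] + (lam**2 - mu) * x[0] ** 2])

def toy_dictionary(x):
    return np.array([x[0], x[1], x[0] ** 2])

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = np.array([toy_step(x) for x in X])
K_hat = edmd(X, Y, toy_dictionary)
```

Here the dictionary is fixed by hand. The autoencoder-based methods discussed above (ULE, CK, and MIDST) instead learn the observables jointly with the linear operator.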
Framework (MIDST): Learn a shared autoencoder (AE) whose encoder φ maps states x ∈ R^n to a latent representation z ∈ R^k and whose decoder ψ maps latent vectors back to the state space. In the latent space, each system m has its own linear propagator K_m, factorized as K_m = A B_m C, where A and C are shared across systems and B_m is system-specific. This deep matrix factorization (DMF) induces implicit rank regularization, enabling richer yet regularized models. There are two operation modes:
- Direct mode: x_{t+1}^m = ψ(K_m φ(x_t^m)) = ψ(A B_m C φ(x_t^m)).
- Residual mode: propagate temporal differences to improve stability and long-horizon behavior: x_{t+1}^m = ψ((K_m Δ + I) φ(x_t^m)) = ψ(φ(x_t^m) + A B_m C φ(x_t^m)). Here Δ denotes the temporal difference operator.
Training across multiple systems jointly learns the shared encoder-decoder (φ, ψ) and matrices A, C; per-system B_m captures system-specific dynamics. The factorization aids sharing and provides rank regularization so k can be set large while the effective rank adapts to data.
Evaluation: Predictions are assessed via Mean Absolute Scaled Error (MASE). Baselines include EDMD, mpEDMD (with time-delay embeddings), ResNet (temporal convolutional configuration), Universal Linear Embedding (ULE), and Consistent Koopman (CK). Single-system models are extended to multi-system settings by sharing the AE across systems while keeping per-system K_m (J-ULE, J-CK). Ablation on sharing schemes considers: disjoint per-system models; single fully shared model; partially shared (AE, A, C shared; only B_m system-specific).
Tasks and data:
- Chaotic attractors (Lorenz): Generate multiple trajectories with identical dynamics but different initializations; use a train/val/test split with gaps to avoid leakage; predict the next H steps given T=64 past states. Study how performance scales with the number of trajectories and the training size, and how well the model adapts to new trajectories/systems with limited samples when training only B_m versus retraining K or the full model. Analyze the effective rank of K and stability via the largest Lyapunov exponent (LLE).
- Arrhythmogenic treatments: Model QT/QTc dynamics for 21 subjects under five anti-arrhythmic treatments; each treatment is a system. Given six initial measurements, predict 10-step future QT and RR, derive QTc, and classify if any future QTc ≥ 460 ms. Evaluate MASE, F1, precision/recall/specificity, and Epps-Singleton goodness-of-fit between predicted and true QTc distributions.
- Sea surface temperature (SST): Predict weekly SST at 50 locations (50 systems) using past T=16 weeks of 16-adjacent-location vectors; evaluate one-step forecasts across locations. Compare MIDST-R versus MIDST-Single and other baselines.
Theoretical context: The Methods detail Koopman operator basics, the notion of shared dynamics (commuting systems admit shared eigenbasis), and justify the DMF factorization’s implicit regularization. Residual-mode interpretation relates to time-delay/Hankel representations, improving stability for long horizons when spectral radius ≤ 1.
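The direct and residual update rules from the framework description can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoder and decoder are replaced by linear stand-ins (in the paper they are neural networks), and the matrix shapes (A is k×r, B_m is r×r, C is r×k) and all dimensions are assumptions, since the summary does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, r, n_systems = 3, 8, 4, 2  # state, latent, inner dims, systems (hypothetical)

# Linear stand-ins for the shared encoder phi and decoder psi.
W_enc = rng.normal(size=(k, n)) / np.sqrt(n)
W_dec = np.linalg.pinv(W_enc)  # exact left inverse, so psi(phi(x)) == x

def phi(x):
    return W_enc @ x

def psi(z):
    return W_dec @ z

# Shared factors A and C; one system-specific factor B_m per system.
A = rng.normal(size=(k, r)) * 0.1
C = rng.normal(size=(r, k)) * 0.1
B = [rng.normal(size=(r, r)) for _ in range(n_systems)]

def propagator(m):
    """Per-system linear propagator K_m = A B_m C."""
    return A @ B[m] @ C

def step_direct(x, m):
    """Direct mode: x_{t+1} = psi(K_m phi(x_t))."""
    return psi(propagator(m) @ phi(x))

def step_residual(x, m):
    """Residual mode: x_{t+1} = psi(phi(x_t) + K_m phi(x_t))."""
    z = phi(x)
    return psi(z + propagator(m) @ z)
```

With these shapes, the rank of every K_m is at most r by construction. The paper instead relies on the implicit regularization of the factorization, so an explicit inner dimension is a simplification made here.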
- Multi-trajectory Lorenz (identical dynamics): Jointly learning multiple approximations with MIDST yields order-of-magnitude lower forecasting error (MASE) than baselines; MIDST-R improves monotonically as the number of trajectories increases, whereas single shared models degrade with more trajectories (Fig. 1a, S2).
- Data efficiency and adaptation: When adapting to a new trajectory/system, training only the system-specific B_m while keeping φ, ψ, A, C fixed significantly outperforms training a new K with a fixed AE or training an entirely new model. For long-horizon autoregressive forecasts (H=20), reusing the pre-trained AE and shared dynamics (training only B_m) reduces sMAE by 52.1% vs. training a new full model and by 71.7% vs. training a new K with a fixed AE (Fig. S3, S5).
- Effective rank: Only for MIDST does the effective rank of K increase with additional trajectories, in line with its improved performance; for the other methods the effective rank does not change (Fig. S4).
- Stability: MIDST exhibits more stable dynamics approximations and produces an LLE closer to the true Lorenz LLE (λ1 ≈ 0.9056) than J-CK (Fig. S9).
- Heterogeneous systems: Across groups with different Lorenz parameters and across unrelated chaotic systems (Sprott, Wang–Sun), MIDST outperforms other methods (Figs. S6, S7). Performance improves with latent dimension k up to a plateau (Fig. S8).
- Arrhythmogenic treatments (stochastic dynamics/statistical fidelity): MIDST-R achieves large gains in classification metrics for predicting future QTc prolongation events: F1-score improvements of 292% over ResNet and 208% over J-CK, and 217% higher specificity than J-ULE (Fig. 4b). The Epps-Singleton (ES) goodness-of-fit test indicates that MIDST-R is the only method whose predicted QTc distribution is not significantly different from ground truth (p=0.43), whereas all others are rejected (p ≤ 0.05). MIDST-R also outperforms MIDST-Single by 60% in F1 (Fig. 4a) and passes the ES test, whereas MIDST-Single is rejected (p = 0.02).
- Sea surface temperature: MIDST-R maintains correct phase and amplitude across 50 locations; MIDST-Single exhibits systematic phase shifts and amplitude bias. MIDST achieves a 25.3% average MASE reduction relative to MIDST-Single and an order-of-magnitude improvement over EDMD/mpEDMD (Fig. 5, S12).
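The adaptation setting reported above, in which only the system-specific B_m is fitted while the shared components stay fixed, can be illustrated in closed form for the direct mode. The sketch below swaps in a least-squares solve in latent space for whatever optimizer the authors use (presumably gradient-based training, which the summary does not detail). The shapes and the synthetic data are assumptions.

```python
import numpy as np

def fit_B(A, C, Z, Z_next):
    """Minimum-norm least-squares B minimizing ||A @ B @ C @ Z - Z_next||_F.

    A (k x r) and C (r x k) are the fixed shared factors; Z and Z_next
    (k x N) hold latent states of the new system at times t and t+1.
    """
    return np.linalg.pinv(A) @ Z_next @ np.linalg.pinv(C @ Z)

# Synthetic check with hypothetical dimensions: recover a known B.
rng = np.random.default_rng(1)
k, r, N = 8, 4, 50
A = rng.normal(size=(k, r))
C = rng.normal(size=(r, k))
B_true = rng.normal(size=(r, r))
Z = rng.normal(size=(k, N))
Z_next = A @ B_true @ C @ Z  # noiseless latent transitions
B_hat = fit_B(A, C, Z, Z_next)
```

Only r×r parameters are estimated for the new system, which is one way to see why adapting B_m alone can require far less data than fitting a full k×k propagator or a new model.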
The study addresses whether and how approximations of one system’s dynamics can improve those of another when systems are interrelated. Findings show that finite-dimensional Koopman-based approximations are highly trajectory-dependent, even for identical underlying dynamics, making naive single shared models ineffective. By learning a shared observable space and partially shared linear dynamics (A, C) while keeping a compact system-specific component (B_m), MIDST exploits common structure and preserves system-specific variability. This improves forecasting accuracy, long-horizon stability, statistical fidelity in stochastic settings (ECG/QTc), and data efficiency for adapting to new systems, thereby directly addressing the challenge of limited data per system. The residual mode enhances stability and long-horizon behavior, connecting to time-delay representations. Collectively, the results demonstrate that explicit multi-system joint learning of interrelated dynamics provides consistent benefits across chaotic, biomedical, and climate applications, and uniquely improves as more related systems are incorporated.
The paper introduces MIDST, a framework for jointly modeling interrelated dynamical systems via a shared autoencoder and factorized per-system linear propagators (K_m = A B_m C), including a residual-dynamics mode. Across diverse benchmarks, MIDST outperforms EDMD/mpEDMD, ResNet, ULE/J-ULE, and CK/J-CK in forecasting accuracy, stability, statistical fidelity, and data efficiency for new-system adaptation. It is the only approach whose performance improves monotonically with the number of related systems. These results suggest that explicitly leveraging shared structure among systems is key for robust, scalable, data-driven modeling of non-linear dynamics. The authors highlight that while exact finite-dimensional KOs may not exist globally, practical rank-regularized approximations learned jointly across systems can provide superior predictive and statistical performance. They anticipate that MIDST will enable better modeling of real-world non-linear systems across domains.
- Exact finite-dimensional Koopman representations often do not exist globally (e.g., mixing chaotic systems like Lorenz); the method relies on finite-dimensional approximations that are locally valid and trajectory-dependent.
- Stability analysis: Due to reliance on deep neural networks, rigorous analytical stability guarantees are impractical; stability is assessed empirically (e.g., via LLE) rather than proven.
- Implicit assumptions: The DMF factorization (K = A B C) induces implicit rank regularization without enforcing commutation assumptions; convergence to minimal nuclear norm is not guaranteed.
- Baseline applicability: Some analytical baselines (EDMD/mpEDMD) are unsuitable for very short trajectories or heterogeneous-system experiments and were excluded in those settings.
- Model selection: Performance depends on representation dimension k, which must be set sufficiently large; improvements plateau beyond a problem-dependent threshold.
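To make the rank-related points above more tangible, here is a minimal sketch of one common definition of effective rank: the exponential of the Shannon entropy of the normalized singular values. Whether the paper uses exactly this definition is an assumption. The function name and the `eps` threshold are illustrative.

```python
import numpy as np

def effective_rank(K, eps=1e-12):
    """Entropy-based effective rank of a matrix K.

    Normalizes the singular values to a probability vector p and returns
    exp(-sum(p * log(p))). Values range from 1 (rank one) to min(K.shape).
    """
    s = np.linalg.svd(K, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero singular values
    return float(np.exp(-(p * np.log(p)).sum()))
```

A matrix with equal singular values attains the maximum, while a rank-one matrix gives a value close to 1, so the measure can track how much of a large latent dimension k is actually used.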

