Computer Science
AI Pontryagin or how artificial neural networks learn to control dynamical systems
L. Böttcher, N. Antulov-Fantulin, et al.
Discover AI Pontryagin, a pioneering neural ordinary differential equation framework developed by Lucas Böttcher, Nino Antulov-Fantulin, and Thomas Asikis. This innovative approach effectively learns control signals for steering complex dynamical systems towards desired states, showcasing remarkable potential in solving tough optimization challenges.
Introduction
The study addresses how to efficiently control complex dynamical systems under practical constraints such as limited control energy and costs. Classical control theory defines controllability as the ability to steer a system from any initial state to any target state within finite time, with applications spanning quantum devices, cellular regulatory networks, power grids, financial systems, and epidemic management. Foundational results for linear systems include Kalman's rank condition and the Popov-Belevitch-Hautus (PBH) test.

Network-focused structural controllability frameworks identify driver nodes but face challenges: determining the minimum driver set can be NP-hard; the design of the control signals is left unspecified and may be impractical; and once nodal self-dynamics are included, a single time-varying input may suffice, challenging earlier results. Optimal control traditionally proceeds via Pontryagin's maximum principle (necessary conditions) or the Hamilton-Jacobi-Bellman (HJB) equation (necessary and sufficient conditions), but both can be intractable in complex, high-dimensional settings.

Approximate dynamic programming and neural-network-based methods have been proposed to approximate value functions or solve differentiable optimal control problems, but they often require smoothness and are limited in scale. Physics-informed neural networks introduce Lagrangian/Hamiltonian priors and have been used for modeling and control of partially known systems. This work proposes AI Pontryagin, a neural ODE-based approach that learns control inputs to reach target states without explicitly solving the maximum principle or HJB equations and without explicit control-energy terms in the loss. The central hypothesis is that AI Pontryagin can approximate optimal control policies and energies via an implicit energy regularization that emerges from ANN initialization and gradient-based training, enabling control of analytically intractable, high-dimensional systems.
Literature Review
The paper surveys key control-theoretic and network-control advances: Kalman controllability and the PBH test for linear systems; structural controllability of complex networks and driver-node identification via maximum matching; and limitations stemming from the NP-hardness of minimal driver sets, the practical infeasibility of some control signals, and the impact of nodal self-dynamics. It reviews optimal control via Pontryagin's maximum principle and the HJB equation, noting their typical intractability and the development of approximate dynamic programming. It summarizes neural approaches: ANN approximations of HJB value functions; differentiable programming methods that solve variants of the maximum principle but require twice-differentiable dynamics and have been demonstrated only on relatively small systems; and physics-informed neural networks that leverage Lagrangian/Hamiltonian structure for modeling and control. Against this backdrop, the authors position AI Pontryagin as a complementary, scalable control framework that avoids explicit adjoint or HJB solutions and explicit energy regularization while still approximating the optimal control energy.
Methodology
AI Pontryagin formulates the control of networked dynamical systems as learning a time-dependent control signal with a neural network inside a neural ODE framework. A system with state x(t) is governed by ẋ = f(x, u), with initial state x(0) and target x* at time T. The control input u(t) is represented by an ANN û(t; w) with parameters w. Training minimizes a task-specific loss J over x(T) without an explicit energy term, typically the mean-squared error to the target, J(x(T), x*) = 1/2 ||x(T) − x*||^2. Gradients ∇_w J are computed via automatic differentiation through the time-unfolded ODE solver, and the parameters are updated by gradient descent or Adam. The ODEs are integrated with the adaptive-step Dormand-Prince (DOPRI5) method.
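The training loop can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the authors' implementation: it uses a one-dimensional toy system ẋ = x + u, a fixed-step unrolled RK4 integrator in place of the adaptive DOPRI solver, and hyperparameters chosen here for the example.

```python
import torch

torch.manual_seed(0)

class Control(torch.nn.Module):
    """ANN u_hat(t; w) mapping time to a control signal."""
    def __init__(self, hidden=6):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1, hidden), torch.nn.ELU(),
            torch.nn.Linear(hidden, 1))
    def forward(self, t):
        return self.net(t)

def f(x, u):
    return x + u  # illustrative dynamics xdot = f(x, u)

def integrate(ctrl, x0, T=1.0, steps=40):
    """Unrolled fixed-step RK4; autodiff flows through every step.
    (u is held constant within each step for simplicity.)"""
    x, h = x0, T / steps
    for k in range(steps):
        u = ctrl(torch.tensor([[k * h]]))
        k1 = f(x, u); k2 = f(x + 0.5 * h * k1, u)
        k3 = f(x + 0.5 * h * k2, u); k4 = f(x + h * k3, u)
        x = x + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

ctrl = Control()
opt = torch.optim.Adam(ctrl.parameters(), lr=0.02)
x0, x_star = torch.tensor([[1.0]]), torch.tensor([[0.0]])
for epoch in range(300):
    opt.zero_grad()
    xT = integrate(ctrl, x0)
    loss = 0.5 * (xT - x_star).pow(2).sum()  # J = 1/2 ||x(T) - x*||^2
    loss.backward()                          # gradient through the solver
    opt.step()
print(float(loss))  # should approach 0 as the control steers x(T) to x*
```

Note that the loss contains no ||u||^2 term; only the terminal mismatch is penalized, as in the paper.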
Implicit energy regularization: The authors analyze why the learned controls resemble optimal energy-minimizing controls despite the absence of an explicit energy term. Linearizing the effect of a weight update Δw on the control yields û(t; w^{n+1}) ≈ û(t; w^{n}) − η D D^T ∇_û J, where D = ∂û/∂w is the Jacobian of the control with respect to the weights and η is the learning rate. Gradient descent in w thus induces a gradient descent in û, and with small initial controls and learning rates, training produces "go-with-the-flow" control trajectories that follow the system's vector field, implicitly minimizing control energy.
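The induced-descent relation can be checked numerically on a toy setup (the tiny network, sample times, and quadratic stand-in loss below are illustrative assumptions): one plain gradient step on the weights should change the control by approximately −η D D^T ∇_û J.

```python
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(torch.nn.Linear(1, 4), torch.nn.ELU(),
                          torch.nn.Linear(4, 1))
ts = torch.tensor([[0.0], [0.5], [1.0]])   # control sampled at 3 times
target = torch.tensor([0.3, -0.2, 0.1])    # arbitrary targets for the toy loss
eta = 1e-3

u = net(ts).squeeze(-1)
grad_u = (u - target).detach()             # grad_u J for J = 1/2 ||u - target||^2

# Jacobian D = d u_hat / d w, one row per control sample
rows = []
for i in range(u.numel()):
    g = torch.autograd.grad(u[i], net.parameters(), retain_graph=True)
    rows.append(torch.cat([x.reshape(-1) for x in g]))
D = torch.stack(rows)

predicted = -eta * D @ D.T @ grad_u        # first-order change in u_hat

# one plain gradient-descent step on the weights w
loss = 0.5 * ((net(ts).squeeze(-1) - target) ** 2).sum()
net.zero_grad(); loss.backward()
with torch.no_grad():
    for p in net.parameters():
        p -= eta * p.grad

actual = net(ts).squeeze(-1).detach() - u.detach()
err = (actual - predicted).norm() / predicted.norm()
print(float(err))  # small relative error: descent in w induces descent in u
```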
Linear system benchmark: For linear dynamics f(x,u) = Ax + Bu, the method is compared against the analytical optimal control (OC) input that minimizes control energy, obtained via the controllability Gramian W(T). A two-state system with A = ((1,0),(1,0)) (i.e., ẋ_1 = x_1 + u, ẋ_2 = x_1), B = (1,0)^T, x(0) = (1,0.5)^T, x* = (0,0)^T, and T = 1 illustrates the convergence of AI Pontryagin trajectories and energies to OC; note that (A, B) must be a controllable pair for W(T) to be invertible.
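The Gramian-based optimal control can be computed directly; a minimal sketch follows, using an illustrative controllable two-state pair (A, B) together with the benchmark's initial state, target, and horizon. The key formula is u*(t) = B^T e^{A^T(T−t)} W(T)^{-1} (x* − e^{AT} x(0)) with W(T) = ∫_0^T e^{As} B B^T e^{A^T s} ds.

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 0.0]])   # illustrative controllable pair
B = np.array([[1.0], [0.0]])
x0 = np.array([1.0, 0.5]); x_star = np.array([0.0, 0.0]); T = 1.0

# For this particular A, A @ A == A (idempotent), so expm(A t) = I + (e^t - 1) A.
def eAt(t):
    return np.eye(2) + (np.exp(t) - 1.0) * A

# Gramian W(T) = int_0^T expm(A s) B B^T expm(A s)^T ds  (trapezoidal rule)
ss = np.linspace(0.0, T, 2001)
W, prev = np.zeros((2, 2)), None
for s in ss:
    cur = eAt(s) @ B @ B.T @ eAt(s).T
    if prev is not None:
        W += 0.5 * (prev + cur) * (T / (len(ss) - 1))
    prev = cur

# Minimum-energy control u*(t) = B^T expm(A (T-t))^T W^{-1} (x* - expm(A T) x0)
lam = np.linalg.solve(W, x_star - eAt(T) @ x0)
u_star = lambda t: (B.T @ eAt(T - t).T @ lam).item()

# Check: integrate xdot = A x + B u*(t) with RK4; x(T) should hit x*
def f(x_, t_):
    return A @ x_ + B[:, 0] * u_star(t_)
x, n = x0.copy(), 400
h = T / n
for k in range(n):
    t = k * h
    k1 = f(x, t); k2 = f(x + 0.5*h*k1, t + 0.5*h)
    k3 = f(x + 0.5*h*k2, t + 0.5*h); k4 = f(x + h*k3, t + h)
    x = x + (h / 6.0) * (k1 + 2*k2 + 2*k3 + k4)
print(np.linalg.norm(x - x_star))  # near zero: target reached
```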
Nonlinear Kuramoto oscillators: The Kuramoto model with phases θ_i and intrinsic frequencies ω_i is considered: θ̇_i = ω_i + f_i(Θ,u), with f_i(Θ,u) = K u_i(t) Σ_j A_{ij} sin(θ_j−θ_i). Natural frequencies and initial phases are drawn from a normal distribution (mean 0, standard deviation 0.2). The synchronization target enforces θ_i(T)−θ_j(T) = 0 on all edges. A subcritical coupling K = 0.1 K* (with K* determined by properties of the graph Laplacian) is used, so that controls u_i(t) > 1 are needed to achieve synchronization. For global control u_i(t) = u(t), AI Pontryagin minimizes J1(Θ(T)) = 1/2 Σ_{ij} A_{ij} sin^2(θ_i(T)−θ_j(T)) (with no energy term) and is compared to an adjoint-gradient method (AGM) derived from Pontryagin's principle, which updates u with a step involving the adjoint λ and an energy-regularization parameter β.
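The controlled Kuramoto dynamics and the loss J1 can be sketched as follows. Everything here is illustrative, not the paper's setup: a small complete graph, a hand-picked subcritical K, Euler integration, and constant control values standing in for the learned time-dependent signal u(t).

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, steps = 10, 3.0, 3000
Aij = 1.0 - np.eye(N)                  # adjacency of a complete graph
omega = rng.normal(0.0, 0.2, N)        # intrinsic frequencies
theta0 = rng.normal(0.0, 0.2, N)       # initial phases
K = 0.001                              # subcritical coupling (illustrative)

def simulate(u):
    """Euler steps of theta_i' = omega_i + K u sum_j A_ij sin(theta_j - theta_i)."""
    th, h = theta0.copy(), T / steps
    for _ in range(steps):
        coupling = (Aij * np.sin(th[None, :] - th[:, None])).sum(axis=1)
        th = th + h * (omega + K * u * coupling)
    return th

def J1(th):
    """Synchronization loss 1/2 sum_ij A_ij sin^2(theta_i - theta_j)."""
    return 0.5 * float((Aij * np.sin(th[:, None] - th[None, :]) ** 2).sum())

# An amplifying control (u >> 1 at subcritical K) drives J1 far below the
# uncontrolled value, mirroring the paper's requirement u_i(t) > 1.
print(J1(simulate(1.0)), J1(simulate(300.0)))
```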
Experiments and comparisons: Networks include a complete graph, Erdős–Rényi G(N,p), a square lattice (no periodic boundary), and a Watts–Strogatz network, typically with N=225 and T=3. The order parameter r(t)=N^{-1} Σ_i cos(θ_i−θ_1) measures synchronization. Hyperparameters are tuned so both methods reach similar synchronization and energy levels. A larger-scale experiment is conducted on a square lattice with N=2500 and T=0.5 to assess scalability and runtime.
Alternative targets: A second loss J2(Θ(T)) = (1/N) Σ_i |θ_i(T)−θ̂_i|^2, with node-specific phase targets θ̂_i (e.g., ±π/4), is used to steer oscillators toward prescribed non-synchronized states such as bimodal −π/4 and π/4 patterns, demonstrating the flexibility of the approach.
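A J2-style loss is straightforward to express; the snippet below is a sketch in which the per-node targets θ̂_i and the bimodal split are illustrative assumptions.

```python
import numpy as np

N = 8
# Bimodal target pattern: half the nodes at -pi/4, half at +pi/4 (illustrative)
theta_hat = np.concatenate([np.full(N // 2, -np.pi / 4),
                            np.full(N // 2, np.pi / 4)])

def J2(theta, theta_hat):
    """Mean squared deviation of final phases from their per-node targets."""
    return float(np.mean(np.abs(theta - theta_hat) ** 2))

# zero at the target pattern, positive away from it
print(J2(theta_hat, theta_hat), J2(theta_hat + 0.1, theta_hat))
```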
Implementation details: Implemented in PyTorch with neural ODE solvers. Two-state system ANN: a single hidden layer with 6 ELU units; linear output for one control; Kaiming uniform initialization; Adam optimizer with learning rate 0.02; 40 time steps. Kuramoto experiments (synchronization, Fig. 4): ANNs trained with stochastic gradient descent; 1 hidden layer with 2 neurons; ELU activations; 2 training epochs; a bias at each node; initial weights 1e−3; learning rates per graph (examples): complete and Erdős–Rényi (η=0.4, γ=5), square lattice (η≈0.32, γ=25), Watts–Strogatz (η≈0.31, γ=15). The large-scale square lattice (N=2500) uses η=0.0125, γ=0.5. For the non-synchronization target (J2), Kaiming initialization and 16 ELUs in one hidden layer are used. Runtimes are measured over 50 runs using Python's timeit. Data and code are publicly available at https://github.com/asikist/nnc.
Key Findings
- Linear systems: AI Pontryagin produces control trajectories and control energy nearly identical to analytical optimal control for a two-state linear system. Energy evolution E_t[u] of AI Pontryagin aligns closely with OC. The method exhibits an implicit energy regularization without explicitly including ||u||^2 in the loss.
- Induced descent evidence: Weight changes and control changes are positively correlated during training, with correlation coefficients reaching ~0.96 over the first 1000 epochs (mean ~0.76), consistent with an induced gradient descent in control space.
- Kuramoto synchronization (N=225, T=3): Across a complete graph, Erdős–Rényi, square lattice, and Watts–Strogatz networks, AI Pontryagin reaches synchronization slightly faster than the adjoint-gradient method (AGM) while achieving similar control energies (normalized energy trajectories in similar ranges). Order parameter r(t) increases comparably or faster for AI Pontryagin.
- Scalability (N=2500, square lattice, T=0.5): AI Pontryagin matches AGM performance with E^{AIP}[u]/E^{AGM}[u] ≈ 1.0045 and r^{AIP}(T)/r^{AGM}(T) ≈ 0.9999. Runtime is substantially reduced: mean 1.03 s (AIP) vs 74 s (AGM) over 50 runs, roughly 70 times faster.
- Flexible targets: Using an alternative loss J2, AI Pontryagin steers oscillators to prescribed non-synchronized phase distributions (e.g., bimodal −π/4 and π/4) as well as full synchronization using J1.
- Overall, AI Pontryagin approximates optimal control energies for both linear and nonlinear networked systems without explicit energy penalties, and demonstrates superior runtime compared to an adjoint-based baseline.
Discussion
The findings demonstrate that AI Pontryagin effectively steers linear and nonlinear networked dynamical systems to desired targets, addressing the central question of whether neural ODE-based control can approximate optimal control without solving adjoint or HJB equations. The close match to analytical optimal control in linear systems and to an adjoint-gradient method in nonlinear Kuramoto networks indicates that AI Pontryagin implicitly regularizes control energy through its training dynamics. This implicit regularization emerges from initializing with small control magnitudes and employing gradient-based updates that induce descent in the control space. The method’s ability to synchronize oscillators across diverse network topologies and to achieve alternative target states underscores its generality. Importantly, the significant runtime gains over adjoint-based methods suggest practicality for high-dimensional systems where traditional optimal control is computationally prohibitive. These results are relevant to applications with stringent energy or resource constraints, providing a scalable and versatile framework that complements classical optimal control.
Conclusion
AI Pontryagin, a neural ODE-based control framework, learns time-dependent control signals that drive complex dynamical systems to target states without explicitly solving Pontryagin’s maximum principle or HJB equations and without explicit energy terms in the loss. It approximates optimal control energies in linear systems and matches adjoint-gradient performance on nonlinear oscillator networks, while being substantially faster and flexible enough to handle diverse targets. The study contributes an analytically motivated explanation for implicit energy regularization via induced gradient descent in control space. Future work may apply AI Pontryagin to quantum control tasks to enhance robustness, and to power-grid problems involving synchronization preservation during cascading failures. Integrations with physics-informed neural networks may enable learning and control for partially unknown dynamics.
Limitations
The approach assumes knowledge of the system’s dynamical model, initial state, and desired target state, and its implicit energy regularization relies on appropriate initialization (small initial controls) and learning rates. Empirical evaluations focus on a linear two-state example and Kuramoto oscillator networks with specific topologies and parameter settings; broader classes of nonlinear systems and constraints are not explored here. Hyperparameters are tuned to achieve comparable synchronization and energy levels with the adjoint baseline. Robustness analyses and additional details are referenced in the Supplementary Information.