On Neural Differential Equations
The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equations are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equation solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.
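As a rough illustration of the abstract's remark that residual networks are discretisations, the sketch below compares an explicit Euler solve of dy/dt = f(y) with a weight-tied residual network. It is not taken from the thesis: the scalar vector field `tanh(w * y + b)`, the function names, and the parameter values are all assumptions chosen for brevity.

```python
import math


def vector_field(y, w, b):
    """Hypothetical one-parameter-pair vector field f(y) = tanh(w * y + b)."""
    return math.tanh(w * y + b)


def euler_solve(y0, w, b, t1, num_steps):
    """Explicit Euler for dy/dt = f(y) on [0, t1] with step size t1 / num_steps."""
    h = t1 / num_steps
    y = y0
    for _ in range(num_steps):
        y = y + h * vector_field(y, w, b)
    return y


def resnet_forward(y0, w, b, num_blocks):
    """Residual network with shared weights: each block computes y + f(y)."""
    y = y0
    for _ in range(num_blocks):
        y = y + vector_field(y, w, b)
    return y


# Illustrative values only (assumed, not from the source).
w, b, y0 = 0.5, -0.1, 1.0

# With unit step size (t1 == num_steps), one Euler step is one residual block.
euler_out = euler_solve(y0, w, b, t1=4.0, num_steps=4)
resnet_out = resnet_forward(y0, w, b, num_blocks=4)
```

Under these assumptions the two outputs coincide, because an Euler step with h = 1 has the same form as a residual block. Taking more, smaller steps over the same interval moves the computation towards the continuous-time solution instead.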
Forward citations
Cited by 30 Pith papers
-
Zero-Shot Size Transfer for Neural ODEs on Sparse Random Graphs: Graphon Limits and Adjoint Convergence
Proves O((α_n n)^{-1/2}) convergence of GNDE trajectories and adjoints to Graphon-NDE limits on sparse random graphs, with DTO/OTD consistency and experimental support for zero-shot transfer.
-
Accelerating Simulation and Optimisation of Cyclic Adsorption Processes with Differentiable Programming
A JAX-based differentiable model of pressure vacuum swing adsorption accelerates cyclic steady-state simulation by 20x via Newton iteration and produces a better Pareto front with IPOPT than NSGA-II in two orders of m...
-
High-dimensional inverse design of inertial fusion implosions via differentiable simulation
Differentiable implosion modeling enables gradient-based optimization of 500-parameter laser pulses for 25 kJ direct-drive ICF implosions on OMEGA-scale targets.
-
Reinforce Adjoint Matching: Scaling RL Post-Training of Diffusion and Flow-Matching Models
Reinforce Adjoint Matching derives a simple consistency loss for RL post-training of diffusion models by tilting the clean distribution toward higher-reward samples under KL regularization while keeping the noising pr...
-
Learning Lindblad Dynamics of a Superconducting Quantum Processor
LIMINAL fits nested Lindblad models to tomographic data and uses likelihood-ratio tests to identify minimal dynamics for a five-qubit superconducting processor, supporting three-local Hamiltonian terms and two-local d...
-
Is Flow Matching Just Trajectory Replay for Sequential Data?
Flow matching on time series targets a closed-form nonparametric velocity field that is a similarity-weighted mixture of observed transition velocities, making neural models approximations to an ideal memory-augmented...
-
Pathwise Learning of Stochastic Dynamical Systems with Partial Observations
A neural path estimation approach learns the filtering posterior path measure for stochastic dynamical systems from noisy partial observations by solving a variational stochastic control problem based on the pathwise ...
-
AMIGO: a Data-Driven Calibration of the JWST Interferometer
AMIGO is an end-to-end differentiable forward model of JWST AMI that corrects detector systematics to recover high-precision astrometry and detect close high-contrast companions.
-
Scattered wave functions and worldline instantons for particle production in curved spacetime
Extends scattered-wave-function and open-worldline-instanton methods to multidimensional curved spacetimes and demonstrates agreement on 2D metric examples.
-
Online forecast reconciliation using linear models
A framework for online forecast reconciliation is developed via multivariate linear models on graph hierarchies, ridge regression, and recursive least squares, with a demonstration on district heating load data.
-
Enhancing LLMs for Graph Tasks via Graph-aware LoRA Generation
GaRA generates task-specific LoRA weight updates conditioned on graph structures to enable better whole-graph encoding in LLMs for zero-shot graph learning.
-
Continuous-Time Probabilistic Correctors for Uncertainty-Aware Physics-Based Spacecraft Trajectory Forecasting
A Latent NCDE-based continuous-time probabilistic corrector wrapped around deterministic physics propagators like GMAT improves forecast accuracy and produces sharp calibrated full-covariance uncertainty estimates on ...
-
Stochastic Lifting for Generating Trajectories of Stochastic Physical Systems
Stochastic Lifting adds random labels to training transitions to train a regression model that generates diverse stochastic trajectories without collapsing to mean predictions.
-
Neuro-Inspired Inverse Learning for Planning and Control
The Inverter framework formalizes inverse learning to generate coherent multi-step trajectories, outperforming offline RL and diffusion baselines on D4RL maze tasks by 24% on average with 10-100x less inference time w...
-
Learning partially observed systems with neural Hamiltonian ordinary differential equations
NHODE framework learns partially observed dynamical systems by combining Hamiltonian neural networks with neural ODEs, enforcing energy conservation and improving long-horizon stability over data-driven baselines on m...
-
Reinforce Adjoint Matching: Scaling RL Post-Training of Diffusion and Flow-Matching Models
Derives RAM, a reward-adjusted consistency loss extending diffusion pretraining regression to efficient KL-regularized RL post-training, achieving peak rewards up to 50x faster than Flow-GRPO on Stable Diffusion 3.5M.
-
Neural Stochastic Processes for Satellite Precipitation Refinement
NSP model fuses satellite and gauge data with neural processes and SDEs, outperforming 13 baselines and JAXA's operational product on a new 43k-sample US benchmark across six metrics.
-
Neural CDEs as Correctors for Learned Time Series Models
Neural CDEs serve as correctors that reduce error accumulation in multi-step forecasts from learned time-series models across synthetic, physics, and real-world data.
-
Multiqubit Rydberg Gates for Quantum Error Correction
Global multiqubit Rydberg gates enable break-even measurement-free QEC and lower-shuttling Floquet codes in neutral-atom hardware.
-
A Weak Penalty Neural ODE for Learning Chaotic Dynamics from Noisy Time Series
The Weak Penalty Neural ODE uses a weak form loss to filter noise and learn stable chaotic dynamics from noisy observations.
-
Estimating Parameter Fields in Multi-Physics PDEs from Scarce Measurements
Neptune infers spatiotemporal parameter fields in PDEs from as few as 45 sparse measurements using independent coordinate neural networks, outperforming PINNs and neural operators with lower errors and better extrapolation.
-
A Two-Phase Deep Learning Framework for Adaptive Time-Stepping in High-Speed Flow Modeling
ShockCast is a two-phase ML method that predicts adaptive timestep sizes to model high-speed flows with shocks more efficiently than fixed-step approaches.
-
Frequency-Domain Neural ODEs for Modeling Non-Linear Dynamical Systems
FNODE projects Neural ODE dynamics into the frequency domain via FFT and reports better generalization and convergence stability than GRUs, LSTMs, and ANODE on Lotka-Volterra, forced Duffing, Van der Pol, and Lorenz systems.
-
Theory of learning of high-dimensional controlled non-linear dynamical systems (I): models and methods
Introduces models for neural ODEs trained with online SGD and derives their high-dimensional learning curves via dynamical mean field theory.
-
Software Between Quantum and Machine Learning -- And Down to Pulses
A JAX-based framework extending quantum machine learning to pulse-level control with composable ansatzes, end-to-end optimization, and Fourier diagnostics.
-
The Physical Limit of Neural Hypoxia Detection in the Black Sea from Satellite Observations
Neural networks can detect 38% of summer hypoxic events shelf-wide from satellites with 47% precision, but only within the homogeneous mixed layer.
-
The Physical Limit of Neural Hypoxia Detection in the Black Sea from Satellite Observations
A neural network trained on model data detects 38% of summer hypoxic events shelf-wide from satellite observations with 47% precision, but only within the homogeneous surface mixing layer.
-
Graph Neural Ordinary Differential Equations for Power System Identification
MPG-NODEs identify power system dynamics more flexibly than standard neural ODEs by using graph message passing, enabling transfer learning for adding or removing lines and units.
-
Physics-constrained identification of graph-based thermal networks for spacecraft digital twins
A physics-constrained inverse-problem framework identifies graph-based lumped-parameter thermal models from temperature measurements for spacecraft digital-twin applications.
-
Software Between Quantum and Machine Learning -- And Down to Pulses
Introduces a JAX-based framework for pulse-level QML with composable ansatze, end-to-end pulse optimization, and Fourier-analytic diagnostics.