arxiv: 2604.04920 · v1 · submitted 2026-04-06 · 🧮 math.OC · cs.LG

PINNs in PDE Constrained Optimal Control Problems: Direct vs Indirect Methods

Zhen Zhang , Shanqing Liu , Alessandro Alla , Jerome Darbon , George Em Karniadakis

Pith reviewed 2026-05-10 18:59 UTC · model grok-4.3

classification 🧮 math.OC cs.LG

keywords physics-informed neural networks, PDE-constrained optimal control, semilinear parabolic equations, Allen-Cahn equation, direct method, indirect method, Pontryagin optimality conditions, adjoint equation

The pith

An indirect PINN formulation based on the optimality system preserves PDE constraints and produces more accurate controls than a direct formulation for semilinear parabolic equations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares two ways to use physics-informed neural networks for optimal control of semilinear PDEs. The direct approach minimizes the objective function while enforcing the state equation as a constraint. The indirect approach instead solves the full first-order optimality system consisting of the state equation, the adjoint equation, and the stationarity condition. Numerical tests on an Allen-Cahn control problem show that the indirect version more accurately satisfies both the PDE and the optimality requirements while also producing smoother control profiles.
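As a rough illustration of how the two formulations differ, the sketch below assembles each loss from residual values that are assumed to have been evaluated already at collocation points. The function names, the weights, and the omission of initial and boundary terms are illustrative assumptions, not the paper's implementation.

```python
def mean_square(residuals):
    """Mean of squared residual values sampled at collocation points."""
    values = list(residuals)
    return sum(r * r for r in values) / len(values)


def direct_loss(objective, state_res, w_state=1.0):
    """Direct formulation: objective value plus a penalty on the state equation."""
    return objective + w_state * mean_square(state_res)


def indirect_loss(state_res, adjoint_res, stationarity_res, weights=(1.0, 1.0, 1.0)):
    """Indirect formulation: residuals of the full first-order optimality system."""
    w_state, w_adjoint, w_stat = weights  # hypothetical loss weights
    return (
        w_state * mean_square(state_res)
        + w_adjoint * mean_square(adjoint_res)
        + w_stat * mean_square(stationarity_res)
    )
```

Note that the objective value appears only in the direct loss. In the indirect loss it enters implicitly through the adjoint and stationarity residuals.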

Core claim

For semilinear parabolic optimal control problems the state, adjoint, and stationarity equations can be written in a form consistent with continuous-time Pontryagin conditions. When these equations are embedded directly into a neural-network loss, the resulting indirect PINN respects the underlying PDE constraint and optimality structure more faithfully than a direct PINN that minimizes the objective under the state constraint alone. Both PINN formulations exhibit an implicit regularizing effect that yields smoother controls than a classical discretize-then-optimize adjoint method.

What carries the argument

The indirect PINN formulation that trains a neural network to satisfy the coupled state-adjoint-stationarity system derived from continuous-time Pontryagin optimality conditions; this system replaces the single objective-plus-state loss used in the direct approach.
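For orientation, one plausible form of the coupled system is written out below. It assumes a quadratic tracking objective with target $u_d$, control weight $\alpha > 0$, diffusion coefficient $\nu$, and the Allen-Cahn reaction term $u - u^3$; none of these specifics is stated on this page, and the paper's own scaling and sign conventions may differ. Only the symbols $u$ (state), $p$ (adjoint), $f$ (control) and the factor $1 - 3u^2$ are taken from the review text.

```latex
\begin{aligned}
\min_{f}\ J(u,f) &= \tfrac12 \int_0^T\!\!\int_\Omega (u-u_d)^2 \,dx\,dt
                  + \tfrac{\alpha}{2} \int_0^T\!\!\int_\Omega f^2 \,dx\,dt, \\
\text{state:}\quad & \partial_t u - \nu \Delta u - (u - u^3) = f, \qquad u(\cdot,0) = u_0, \\
\text{adjoint:}\quad & -\partial_t p - \nu \Delta p - (1 - 3u^2)\,p = u - u_d, \qquad p(\cdot,T) = 0, \\
\text{stationarity:}\quad & \alpha f + p = 0 .
\end{aligned}
```

Under these assumptions the adjoint runs backward in time from a terminal condition, and the stationarity condition recovers the control pointwise as $f = -p/\alpha$.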

If this is right

The indirect PINN satisfies the continuous-time optimality conditions more closely than the direct PINN for the tested class of semilinear parabolic problems.
Neural-network parameterizations introduce an implicit smoothing effect on the computed control without any added regularization term.
For the Allen-Cahn equation the indirect formulation outperforms the direct one in both constraint satisfaction and approximation accuracy.
The same PINN regularization benefit appears when compared against a standard discretize-then-optimize adjoint scheme.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Deriving and embedding the adjoint equation may become the preferred route when adjoint information is already available from classical optimal-control theory.
The observed smoothing effect could reduce the need for explicit penalty terms in control problems that admit discontinuous or bang-bang solutions.
Testing the indirect formulation on higher-dimensional or strongly nonlinear PDEs would reveal whether the advantage persists beyond the semilinear parabolic setting.
Hybrid schemes that use PINNs only for the adjoint while retaining a traditional state solver could combine the regularization benefit with existing PDE infrastructure.

Load-bearing premise

The first-order optimality system for the semilinear parabolic control problem can be expressed in a form that matches continuous-time Pontryagin conditions without extra regularity requirements that neural-network solutions might violate.

What would settle it

A high-accuracy reference solution for the Allen-Cahn control problem in which the indirect PINN residual on the stationarity condition exceeds the direct PINN residual or the indirect solution deviates farther from the reference control.
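The test described above can be phrased as a small check. The sketch below assumes a discrete relative L2 error against reference control values and uses hypothetical function names; how the paper itself measures error is not specified on this page.

```python
import math


def relative_l2_error(approx, reference):
    """Discrete relative L2 error of approx against reference values."""
    num = math.sqrt(sum((a - r) ** 2 for a, r in zip(approx, reference)))
    den = math.sqrt(sum(r ** 2 for r in reference))
    return num / den


def claim_is_contradicted(stat_res_direct, stat_res_indirect, err_direct, err_indirect):
    """True if the indirect PINN is worse on either measure named in the test."""
    return stat_res_indirect > stat_res_direct or err_indirect > err_direct
```

The first two arguments of `claim_is_contradicted` are stationarity residual norms, and the last two are control errors such as those returned by `relative_l2_error`.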

Figures

Figures reproduced from arXiv: 2604.04920 by Alessandro Alla, George Em Karniadakis, Jerome Darbon, Shanqing Liu, Zhen Zhang.

**Figure 2.** Control profiles, shown from top to bottom: ad

**Figure 3.** State trajectories corresponding to the converged

Original abstract

We study physics-informed neural networks (PINNs) as numerical tools for the optimal control of semilinear partial differential equations. We first recall the classical direct and indirect viewpoints for optimal control of PDEs, and then present two PINN formulations: a direct formulation based on minimizing the objective under the state constraint, and an indirect formulation based on the first-order optimality system. For a class of semilinear parabolic equations, we derive the state equation, the adjoint equation, and the stationarity condition in a form consistent with continuous-time Pontryagin-type optimality conditions. We then specialize the framework to an Allen-Cahn control problem and compare three numerical approaches: (i) a discretize-then-optimize adjoint method, (ii) a direct PINN, and (iii) an indirect PINN. Numerical results show that the PINN parameterization has an implicit regularizing effect, in the sense that it tends to produce smoother control profiles. They also indicate that the indirect PINN more faithfully preserves the PDE constraint and optimality structure and yields a more accurate neural approximation than the direct PINN.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Indirect PINN beats direct on the Allen-Cahn control example by better enforcing the full optimality system, but the size of the edge is hard to judge without the actual error numbers.

The main thing to know is that the indirect PINN, which trains on the state-adjoint-stationarity system, keeps the PDE constraint and optimality conditions tighter than the direct PINN that only penalizes the objective plus state equation. Both neural approaches produce smoother controls than the classical discretize-then-optimize baseline, which the authors attribute to implicit regularization from the network parameterization. That observation is the most immediately useful empirical note in the work.

The derivation of the first-order conditions for the semilinear parabolic class is standard but cleanly specialized to the Allen-Cahn case, and the head-to-head comparison against the classical method gives a concrete data point on method choice. The numerical results are presented as favoring the indirect route on accuracy and constraint satisfaction.

The soft spot is that the abstract gives no tables, no reported residual norms, and no description of how the post-training PDE and optimality residuals were actually computed or weighted. Without those details it is difficult to tell whether the reported advantage is robust or sensitive to loss balancing. The stress-test point about regularity for the linearized nonlinearity and stationarity condition is reasonable to check, but the paper's results appear internally consistent on the tested example, so it does not look like a fatal gap.

This is a paper for people already working on PINNs or neural methods for PDE control who want a practical comparison on a standard test problem. It is not a broad theoretical advance, but the comparison is honest enough to be worth referee time so the numerical claims can be examined against code or extra tests.

Referee Report

3 major / 2 minor

Summary. The manuscript investigates the application of physics-informed neural networks (PINNs) to PDE-constrained optimal control problems for semilinear parabolic equations. It presents direct and indirect PINN formulations, derives the corresponding first-order optimality system (state, adjoint, and stationarity conditions) for an Allen-Cahn control problem in a form consistent with continuous-time Pontryagin conditions, and compares the two PINN approaches numerically to a classical discretize-then-optimize adjoint method. The key findings are that the PINN parameterization exerts an implicit regularizing effect on control profiles and that the indirect PINN more faithfully preserves the PDE constraint and optimality structure while yielding a more accurate neural approximation than the direct PINN.

Significance. If the numerical superiority of the indirect formulation is confirmed with quantitative evidence, the work would offer a practical demonstration that indirect PINN methods can better maintain structural fidelity to optimality conditions in PDE control settings compared to direct methods. The explicit derivation of the optimality system and the comparison against a classical discretize-then-optimize baseline are strengths that could help position PINNs as viable alternatives for such problems. The observed regularizing effect on controls is a potentially useful empirical observation.

major comments (3)

[Abstract and Numerical Experiments] Abstract and Numerical Experiments section: The central claim that 'the indirect PINN more faithfully preserves the PDE constraint and optimality structure and yields a more accurate neural approximation than the direct PINN' is stated without accompanying quantitative error tables, L2-norm comparisons, or reported PDE residual norms for the three methods (discretize-then-optimize, direct PINN, indirect PINN). This leaves the superiority assertion unsupported by the visible numerical evidence.
[Optimality system derivation] Section deriving the optimality system for the Allen-Cahn problem: The adjoint equation incorporates the linearized nonlinearity (factor 1-3u²) and the stationarity condition couples the adjoint to the control, but the manuscript provides no discussion of a priori bounds on the approximation error for the nonlinear term or verification that the weak form of the optimality conditions is recovered in the continuum limit from the discrete collocation enforcement. This makes it unclear whether the reported advantage is structural or an artifact of loss weighting.
[Numerical Experiments] Numerical Experiments section: Details on how post-training PDE residuals were measured (e.g., at which points, with what quadrature) and the precise baseline implementation of the discretize-then-optimize adjoint method are not provided, preventing independent assessment of the constraint-preservation and accuracy comparisons.
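The factor 1-3u² named in the second major comment is the derivative of the cubic reaction term with respect to the state. The short check below confirms this numerically, assuming the standard Allen-Cahn reaction term u - u³ (the exact scaling used in the paper is not given on this page); the function names are illustrative.

```python
def reaction(u):
    """Assumed Allen-Cahn reaction term N(u) = u - u^3."""
    return u - u ** 3


def reaction_derivative(u):
    """Linearization N'(u) = 1 - 3 u^2 that multiplies the adjoint variable."""
    return 1.0 - 3.0 * u ** 2


def central_difference(func, u, h=1e-5):
    """Second-order finite-difference approximation of func'(u)."""
    return (func(u + h) - func(u - h)) / (2.0 * h)


samples = [-1.0, -0.5, 0.0, 0.5, 1.0]
max_gap = max(
    abs(central_difference(reaction, u) - reaction_derivative(u)) for u in samples
)
```

Because the factor depends on the state u, any error in the approximated state feeds into the adjoint equation, which is the coupling the referee asks the authors to bound.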

minor comments (2)

[Problem formulation] The notation distinguishing the state u, adjoint p, and control f could be introduced more explicitly in the problem formulation section to avoid ambiguity when transitioning between direct and indirect formulations.
[Figures] Figure captions for the control profiles and residual plots should include the specific loss weights and collocation point counts used, to aid reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major point below and have revised the manuscript accordingly where the suggestions strengthen the presentation.

Referee: [Abstract and Numerical Experiments] Abstract and Numerical Experiments section: The central claim that 'the indirect PINN more faithfully preserves the PDE constraint and optimality structure and yields a more accurate neural approximation than the direct PINN' is stated without accompanying quantitative error tables, L2-norm comparisons, or reported PDE residual norms for the three methods (discretize-then-optimize, direct PINN, indirect PINN). This leaves the superiority assertion unsupported by the visible numerical evidence.

Authors: We agree that explicit quantitative metrics would make the comparison more transparent. In the revised manuscript we have added a table in the Numerical Experiments section reporting L2-norm errors (relative to a high-resolution reference) for the state, adjoint, and control, together with the PDE residual norms evaluated for all three methods. These data directly corroborate the abstract claim. Revision: yes.
Referee: [Optimality system derivation] Section deriving the optimality system for the Allen-Cahn problem: The adjoint equation incorporates the linearized nonlinearity (factor 1-3u²) and the stationarity condition couples the adjoint to the control, but the manuscript provides no discussion of a priori bounds on the approximation error for the nonlinear term or verification that the weak form of the optimality conditions is recovered in the continuum limit from the discrete collocation enforcement. This makes it unclear whether the reported advantage is structural or an artifact of loss weighting.

Authors: The optimality system is derived in strong form from the continuous Pontryagin conditions; the PINN enforces it by collocation. We have inserted a short paragraph clarifying that, in the limit of increasing collocation points and network capacity, the discrete enforcement is consistent with the continuous strong-form conditions. A full a priori error analysis for the nonlinear term is beyond the scope of this methodological comparison paper; we have added a remark noting this as future work. Revision: partial.
Referee: [Numerical Experiments] Numerical Experiments section: Details on how post-training PDE residuals were measured (e.g., at which points, with what quadrature) and the precise baseline implementation of the discretize-then-optimize adjoint method are not provided, preventing independent assessment of the constraint-preservation and accuracy comparisons.

Authors: We thank the referee for this request. The revised Numerical Experiments section now states that residuals are computed on a uniform fine grid (1000 spatial points, 200 temporal points) using the composite trapezoidal rule. We have also added a precise description of the discretize-then-optimize baseline: linear finite elements in space, implicit Euler in time, and L-BFGS for the resulting finite-dimensional optimization problem. Revision: yes.
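The residual measurement described in the last response can be sketched as follows. This is a minimal illustration, assuming a unit space-time domain [0, 1] x [0, 1] and a residual supplied as a Python callable; the domain, the function names, and the test residual are assumptions, and only the grid size and the quadrature rule come from the response above.

```python
import math


def trapezoid_weights(n, length):
    """Composite trapezoidal weights for n equally spaced points on an interval."""
    h = length / (n - 1)
    return [h / 2 if i in (0, n - 1) else h for i in range(n)]


def residual_l2_norm(residual, nx, nt, x_len=1.0, t_len=1.0):
    """Approximate the L2 norm of residual(x, t) over [0, x_len] x [0, t_len]."""
    wx = trapezoid_weights(nx, x_len)
    wt = trapezoid_weights(nt, t_len)
    hx = x_len / (nx - 1)
    ht = t_len / (nt - 1)
    total = 0.0
    for j in range(nt):
        t = j * ht
        for i in range(nx):
            x = i * hx
            # Tensor-product trapezoidal rule applied to the squared residual.
            total += wt[j] * wx[i] * residual(x, t) ** 2
    return math.sqrt(total)


# Stand-in residual x * t, whose exact L2 norm on the unit square is 1/3.
norm = residual_l2_norm(lambda x, t: x * t, nx=1000, nt=200)
```

In practice the callable would evaluate the trained network's PDE or optimality residual at each grid point rather than a closed-form expression.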

Circularity Check

0 steps flagged

No circularity: optimality system derived from standard Pontryagin conditions; comparisons are between independent loss formulations

The paper recalls classical direct/indirect optimal control viewpoints, then derives the state-adjoint-stationarity system for semilinear parabolic PDEs in a form consistent with continuous-time Pontryagin conditions before specializing to Allen-Cahn. This derivation is presented as a standard recall and specialization rather than a reduction to fitted quantities, self-citations, or ansatzes imported from the authors' prior work. Numerical results compare three distinct approaches (discretize-then-optimize, direct PINN, indirect PINN) via empirical performance on PDE constraint satisfaction and accuracy; no prediction is forced by construction from the inputs, and the implicit regularization effect is reported as an observed outcome rather than a definitional tautology. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Framework rests on standard existence results from optimal control theory for semilinear parabolic PDEs; no free parameters or new entities are introduced in the abstract.

axioms (1)

Domain assumption: Solutions to the state and adjoint equations exist and are sufficiently regular for the stationarity condition to be well-defined.
Invoked when deriving the indirect formulation consistent with Pontryagin conditions.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: For a class of semilinear parabolic equations, we derive the state equation, the adjoint equation, and the stationarity condition in a form consistent with continuous-time Pontryagin-type optimality conditions.

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear
Paper passage: Numerical results show that the indirect PINN more faithfully preserves the PDE constraint and optimality structure

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Computational Control of Nonlinear Partial Differential Equations Using Machine Learning
math.OC 2026-04 unverdicted novelty 5.0

A physics-informed neural network method is developed to approximate controls for nonlinear PDEs, including convergence analysis and numerical experiments demonstrating good performance.
