2010.01742
A Convex Approach to Data-driven Optimal Control via Perron-Frobenius and Koopman Operators
Bowen Huang, Umesh Vaidya
incomplete, medium confidence
- Category: Not specified
- Journal tier: Specialist/Solid
- Processed: Sep 28, 2025, 12:55 AM
Audit review
The paper’s Theorem 5 asserts that the infinite-horizon OCP with feedback u=k(x) can be “written as” an infinite-dimensional convex program over densities (ρ, ρ̄) with constraint ∇·(fρ+gρ̄)=h and objective ∫(qρ+ρ̄^T R ρ̄/ρ), with the feedback recovered as k=ρ̄/ρ (equations (20)–(22)). The proof shows that, for any fixed feedback k, the associated occupation density ρ(x)=∫_0^∞[P_t^c h](x)dt satisfies ∇·((f+gk)ρ)=h and reproduces the cost via ⟨q+k^TRk,ρ⟩ (equations (27)–(31)). This establishes that every feedback induces a feasible pair (ρ,ρ̄), with ρ̄=kρ, that attains the same value, i.e., the convex program is a relaxation of the OCP and its optimal value is a lower bound on the optimal control cost.

However, the paper does not prove the reverse inequality, namely that every feasible (ρ,ρ̄) yields a realizable feedback with matching occupation density, so that there is no relaxation gap. Nor does it provide a dual (HJB-type) verification argument or a no-duality-gap result; see also the paper’s remark that more technical results are deferred.

By contrast, the candidate solution supplies the missing pieces: (i) a careful distributional/measure-theoretic formulation and the perspective convention for the term ρ̄^T R ρ̄/ρ; (ii) a convex dual yielding the HJB subsolution inequality q+∇V·f−(1/4)∇V·gR^{-1}g^T∇V≥0, furnishing tight lower bounds on all feedback costs; (iii) a Fenchel–Rockafellar-based no-duality-gap result under the stated feasibility/stability assumptions; and (iv) a KKT-based synthesis showing that k*=−(1/2)R^{-1}g^T∇V* is optimal and that J(µ;k*)=∫V*h. These elements complete the equivalence proof that the paper sketches but does not fully establish. The paper’s additional implementation details (the exclusion of a small neighborhood N to handle the singularity of ρ at the equilibrium, and the data-driven approximations (41)–(42)) are consistent with the model’s setup but orthogonal to the missing theoretical direction.
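The lower-bound step in item (ii) can be sketched as a weak-duality calculation. This is an illustrative derivation, not taken from the paper: it assumes that V is continuously differentiable, that R is symmetric positive definite, that ρ>0 wherever ρ̄≠0 (so k=ρ̄/ρ is defined), and that boundary terms vanish when integrating by parts, which is a nontrivial assumption given the singularity of ρ at the equilibrium.

```latex
\begin{align*}
\int V h \, dx
  &= \int V \, \nabla\!\cdot (f\rho + g\bar\rho)\, dx
   = -\int \nabla V \cdot (f\rho + g\bar\rho)\, dx, \\
\int \Big( q\rho + \frac{\bar\rho^{\top} R\, \bar\rho}{\rho} \Big)\, dx - \int V h \, dx
  &= \int \rho \, \big( q + \nabla V \cdot f + k^{\top} R k + \nabla V \cdot g k \big)\, dx,
  \qquad k = \bar\rho / \rho, \\
k^{\top} R k + \nabla V \cdot g k
  &= \big( k + \tfrac{1}{2} R^{-1} g^{\top} \nabla V \big)^{\top} R \,
     \big( k + \tfrac{1}{2} R^{-1} g^{\top} \nabla V \big)
     - \tfrac{1}{4}\, \nabla V \cdot g R^{-1} g^{\top} \nabla V .
\end{align*}
```

Under these assumptions, whenever q+∇V·f−(1/4)∇V·gR^{-1}g^T∇V≥0 the right-hand side of the second line is nonnegative, so every feasible (ρ,ρ̄) has cost at least ∫Vh. The bound is attained when k=−(1/2)R^{-1}g^T∇V and the subsolution inequality holds with equality on the support of ρ.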
Referee report (LaTeX)
\textbf{Recommendation:} major revisions
\textbf{Journal Tier:} specialist/solid
\textbf{Justification:}
The paper offers a promising convex density-based formulation of nonlinear OCP tied to data-driven Perron-Frobenius/Koopman operator approximations. The forward mapping from feedback to density variables is sound and practically useful. However, the central theorem’s claim of full equivalence remains only partially justified: the reverse direction (that an optimizer of the convex program yields an optimal feedback with no relaxation gap) is not proved, and no dual/verification argument is provided. Adding a rigorous dual (HJB-subsolution) framework and a no-duality-gap result would substantially strengthen the correctness of the main result. As it stands, the contribution is valuable but requires theoretical reinforcement.
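The dual (HJB-subsolution) argument requested above can be sanity-checked on a scalar linear-quadratic instance. The following is a minimal sketch under assumptions that are not from the paper: dynamics ẋ = a·x + b·u, running cost q·x² + r·u², linear feedback u = k·x, a quadratic candidate V(x) = c·x², and an initial density h concentrated at a single point x0 (so that ∫Vh = V(x0)). All function names and parameter values are hypothetical.

```python
import math


def riccati_p(a, b, q, r):
    """Positive root p of q + 2*a*p - (b**2 / r) * p**2 = 0 (scalar Riccati equation)."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def hjb_residual(c, a, b, q, r):
    """Coefficient of x**2 in q + V'f - (1/4) V' g R^{-1} g V' for V(x) = c * x**2.

    Here f = a*x, g = b, R = r and V'(x) = 2*c*x (assumed scalar setting).
    """
    return q + 2.0 * a * c - (b * b / r) * c * c


def feedback_cost(k, a, b, q, r, x0=1.0):
    """Closed-form infinite-horizon cost of u = k*x started at x0.

    Requires a stabilizing gain, i.e. a + b*k < 0.
    """
    a_cl = a + b * k
    if a_cl >= 0.0:
        raise ValueError("gain k is not stabilizing")
    return (q + r * k * k) * x0 * x0 / (-2.0 * a_cl)


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 1.0, 1.0  # hypothetical unstable open-loop example
    p = riccati_p(a, b, q, r)
    k_star = -b * p / r  # equals -(1/2) R^{-1} g V'(x) / x for V = p * x**2
    print("HJB residual at V*:", hjb_residual(p, a, b, q, r))
    print("optimal cost vs V*(1):", feedback_cost(k_star, a, b, q, r), p)
    for k in (-1.5, -2.0, -3.0, -10.0):
        print(k, feedback_cost(k, a, b, q, r))
```

In this toy setting, any c between 0 and p gives a nonnegative residual, so c·x0² is a lower bound on the cost of every stabilizing gain, and the bound is tight at c = p with the gain k* above.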