2012.13869
Neural Closure Models for Dynamical Systems
Abhinav Gupta, Pierre F.J. Lermusiaux
incomplete (medium confidence)
- Category: math.DS
- Journal tier: Strong Field
- Processed: Sep 28, 2025, 12:55 AM
Audit review
For the discrete-delay case, the paper’s adjoint DDE and gradient match the model’s derivation: the adjoint is an advanced equation when integrated backward in time, the delta-sampled loss produces jumps in the adjoint, and the gradient is d_θL = −∫ λ^T ∂_θ f_RNN dt. See the stated adjoint (Eq. 14) and the gradient formulas (Eqs. 7–8 in the supplement, and the main text), which align with the model’s steps. For the distributed-delay (coupled) case, the coupled adjoint system (Eq. 19) also matches the model, including α = μ(0) from the boundary term of the y-variation. However, the paper’s stated gradient with respect to φ is incomplete: it omits the contribution from the φ-dependence of the initial condition y(0) = ∫_{−τ2}^{−τ1} g_NN(h(t); φ) dt, which enters the Lagrangian through the multiplier α. That dependence yields the extra term −α^T ∫_{−τ2}^{−τ1} ∂_φ g_NN(h(t); φ) dt with α = μ(0). The model includes this missing history-correction term explicitly, so its expression for d_φL is correct and more complete than the paper’s. As a secondary note, the paper repeatedly calls δ(t) the Kronecker delta in continuous time (it should be the Dirac delta), and it posits distribution-valued multipliers (e.g., μ = λ δ) that are unnecessary given the fixed, parameter-independent history; neither issue alters the main results, but both obscure the logic.
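The effect of the omitted history-correction term can be checked on a toy problem. The sketch below is not the paper's system: it assumes a hypothetical scalar ODE dy/dt = −a·y + φ whose initial condition y(0) = φ·c depends on the parameter (c stands in for the history integral), with loss L = y(T). All names (`loss`, `adjoint_gradient`, `a`, `T`, `c`) are illustrative assumptions. Under the sign convention used above, the adjoint is λ(t) = −exp(a(t − T)), α = λ(0), and the gradient is −∫ λ ∂_φ f dt − α ∂_φ y(0).

```python
import math

def loss(phi, a=0.5, T=2.0, c=1.5):
    """Toy loss L = y(T) for dy/dt = -a*y + phi with y(0) = phi*c (closed form)."""
    y0 = phi * c  # parameter-dependent initial condition (stand-in for the history integral)
    decay = math.exp(-a * T)
    return y0 * decay + phi * (1.0 - decay) / a

def adjoint_gradient(a=0.5, T=2.0, c=1.5, include_history_term=True):
    """dL/dphi from the adjoint lambda(t) = -exp(a*(t - T)), evaluated in closed form."""
    decay = math.exp(-a * T)
    # Running term: -int_0^T lambda(t) * d f/d phi dt, with d f/d phi = 1.
    running = (1.0 - decay) / a
    # Boundary multiplier alpha = lambda(0).
    alpha = -decay
    # History-correction term: -alpha * d y(0)/d phi, with d y(0)/d phi = c.
    history = -alpha * c if include_history_term else 0.0
    return running + history

def finite_difference(phi=0.7, eps=1e-6, **kwargs):
    """Central-difference reference gradient of the toy loss."""
    return (loss(phi + eps, **kwargs) - loss(phi - eps, **kwargs)) / (2.0 * eps)

if __name__ == "__main__":
    reference = finite_difference()
    print("finite difference:      ", reference)
    print("adjoint, with term:     ", adjoint_gradient())
    print("adjoint, term omitted:  ", adjoint_gradient(include_history_term=False))
```

In this toy setting the gradient with the history term agrees with the finite-difference reference, while dropping the term leaves a gap of c·exp(−aT), which mirrors the kind of discrepancy the audit attributes to the paper's d_φL.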
Referee report (LaTeX)
\textbf{Recommendation:} minor revisions
\textbf{Journal Tier:} strong field
\textbf{Justification:}
The manuscript’s adjoint derivations for discrete and distributed neural DDEs are largely correct and useful. The coupled adjoint system is right, and the discrete-delay case is handled cleanly. The only substantive issue is that the published gradient with respect to the distributed-delay parameters $\phi$ omits the history-dependent term induced by the $\phi$-dependence of $y(0)$; this is easy to fix. Clarifying the delta terminology (Dirac rather than Kronecker) and avoiding unnecessary distribution-valued multipliers would improve readability.