2101.02966

Infinite-dimensional Folded-in-time Deep Neural Networks

Florian Stelzer, Serhiy Yanchuk

correct (high confidence)
Category
math.DS
Journal tier
Specialist/Solid
Processed
Sep 28, 2025, 12:55 AM

Audit review

The paper’s Theorem 1 states functional backpropagation formulas for L1-Fit-DNNs, including Δ^out = ŷ − y, the last-layer error signal Δ^L, the backward recursions for Δ^ℓ, and the parameter gradients; the proof proceeds by a functional chain rule with explicit calculations of δx/δā and integral interchanges. The candidate solution reproduces the same layer-by-layer Gateaux/Fréchet calculus and Tonelli/Fubini swaps, arriving at the same expressions, and explicitly flags one subtlety: an indicator factor χ_[0,T](s − τ′_d) that is present in the rigorous derivation of δā^{ℓ+1}(s)/δā^ℓ(ω_0) but is omitted from the paper’s displayed Δ^ℓ(ω_0) recursion. The paper informally relies on the extension x^ℓ(u) = 0 for u ∉ [0,T] and uses it in the derivation text, yet the corresponding indicator is missing from the final Δ^ℓ(ω_0) display; the candidate notes that taking the canonical representative of M^{ℓ+1,d} to vanish on {s : s − τ′_d ∉ [0,T]} reconciles the formulas. Aside from this minor display-level omission, the methods and results are in substantive agreement, with matching proofs and parameter gradients (items (iv)–(vii)) as stated in Theorem 1, proved via the chain rule and the derivative identities (Eqs. (42)–(46), (49)–(54)).
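
A minimal numerical sketch of that reconciliation, under illustrative assumptions (uniform grid on [0,T], placeholder kernels M^{ℓ+1,d} and delays τ′_d, and a schematic integrand rather than the paper’s exact Theorem 1 recursion): once the zero extension x^ℓ(u) = 0 for u ∉ [0,T] is applied, multiplying by the explicit indicator χ_[0,T](s − τ′_d) changes nothing, which is exactly why the two displays agree.

import numpy as np

# Schematic check only: kernels, delays, and the recursion shape below are
# illustrative placeholders, NOT the paper's Theorem 1 formulas.
T, h = 1.0, 1e-3
s = np.arange(0.0, T, h)                  # uniform grid on [0, T)
rng = np.random.default_rng(0)
D = 3                                      # number of delay terms (schematic)
tau = rng.uniform(0.0, T, size=D)          # delays tau'_d (schematic)
M = rng.standard_normal((D, s.size))       # kernels M^{l+1,d}(s) (schematic)
delta_next = np.sin(2 * np.pi * s)         # stand-in for Delta^{l+1}

def backward_step(with_indicator: bool) -> np.ndarray:
    """One schematic backward step Delta^{l+1} -> Delta^l by quadrature."""
    out = np.zeros_like(s)
    for d in range(D):
        arg = s - tau[d]
        # zero extension: values outside [0, T) are read as 0
        shifted = np.interp(arg, s, delta_next, left=0.0, right=0.0)
        term = M[d] * shifted
        if with_indicator:
            # the indicator chi_[0,T](s - tau'_d), enforced explicitly
            term = term * ((arg >= 0.0) & (arg < T))
        out = out + h * term               # quadrature of the d-th term
    return out

# With the zero-extension convention in force, the indicator is a no-op:
assert np.allclose(backward_step(True), backward_step(False))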

Referee report (LaTeX)

\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} specialist/solid

\textbf{Justification:}

The paper develops a rigorous functional backpropagation scheme for delay-based, infinite-dimensional neural networks (L1-Fit-DNNs), with clean statements and proofs and a bridge to discrete-time approximations. The mathematics meets the standard expected of such results while extending their scope. A small presentation fix (restoring the omitted indicator χ_[0,T](s − τ′_d) in the displayed Δ^ℓ recursion and rewording one slightly misleading sentence) would improve precision, but the core results stand and are useful; a schematic form of the fix follows.
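
\textbf{Suggested fix (schematic):} the display below is a shape sketch only, not the paper's exact Theorem 1 formula; the kernel details, abbreviated as $(\cdots)$, follow the paper. Each delay term in the backward recursion should carry the indicator that the rigorous derivation of the functional derivative produces:
\[
\Delta^{\ell}(\omega_0)
= \int_{0}^{T} \Delta^{\ell+1}(s)\,
\frac{\delta \bar a^{\ell+1}(s)}{\delta \bar a^{\ell}(\omega_0)}\,\mathrm{d}s,
\qquad
\frac{\delta \bar a^{\ell+1}(s)}{\delta \bar a^{\ell}(\omega_0)}
\;\propto\; \sum_{d} \chi_{[0,T]}(s-\tau'_d)\, M^{\ell+1,d}(s)\,(\cdots).
\]
Equivalently, the display can stand as printed if the text states alongside it that $x^{\ell}(u)=0$ for $u\notin[0,T]$ and that the canonical representative of $M^{\ell+1,d}$ vanishes on $\{s : s-\tau'_d\notin[0,T]\}$.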