2006.15226

Riemannian Optimization on the Symplectic Stiefel Manifold

Bin Gao, Nguyen Thanh Son, P.-A. Absil, Tatjana Stykel

incomplete (medium confidence)

Category: Not specified
Journal tier: Strong Field
Processed: Sep 28, 2025, 12:55 AM

Audit review

The paper’s Theorem 5.6 states that for Algorithm 1 (Riemannian gradient method with a non-monotone Armijo backtracking line search and a locally defined retraction), every accumulation point X* with 0_{X*} in the interior of dom(R) is a critical point. The proof sketches a standard Zhang–Hager-style nonmonotone analysis, shows that {c_k} is decreasing, and derives a summability bound of the form Σ β t_k ||grad f(X_k)||^2 / q_{k+1} ≤ c_0 − c_∞. It then asserts lim_{k→∞} t_k ||grad f(X_k)||^2 = 0 along a subsequence K converging to the accumulation point, concludes t_k → 0 on K, and obtains a contradiction via a mean value argument and the retraction’s first-order property (D(f∘R_{X*})(0) = Df(X*)).

However, from Σ a_k/(k+2) < ∞ one cannot conclude a_k → 0 along an arbitrary subsequence K. This step is used to infer t_k → 0 on K, which is needed to place the previous trial step t_k/δ inside a fixed neighborhood of 0_{X*}, and it is not justified as written, at least when α=1, where q_{k+1} = k+2. Thus the proof as presented is incomplete for α=1. For α∈[0,1) the standard argument goes through, because q_{k+1} is uniformly bounded.

The candidate model solution, on the other hand, incorrectly claims monotone decrease of f(X_k) under the nonmonotone rule: it uses c_k ≥ f(X_k) to infer f(X_{k+1}) ≤ f(X_k) − β t_k ||grad f(X_k)||^2. The line search only gives f(X_{k+1}) ≤ c_k − β t_k ||grad f(X_k)||^2, so replacing c_k by the smaller quantity f(X_k) applies the inequality in the wrong direction; the claim does not follow unless α=0 (pure Armijo), where c_k = f(X_k). It also imposes an implicit C^2/Lipschitz-gradient-type “uniform descent lemma” that is not assumed in the paper. Hence, both are incomplete as written.
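The two facts the audit relies on can be checked numerically with a minimal sketch. It assumes the usual Zhang–Hager weight recursion q_0 = 1, q_{k+1} = α q_k + 1, which is consistent with q_{k+1} = k+2 at α=1. The function names and the spike sequence a_k are hypothetical and purely illustrative: the sequence shows only that the abstract inference "Σ a_k/(k+2) < ∞ implies a_k → 0 on every subsequence" fails, not that the iterates of Algorithm 1 can actually behave this way.

```python
def q_sequence(alpha, n):
    """Weights q_0..q_n, assuming q_0 = 1 and q_{k+1} = alpha * q_k + 1."""
    q = [1.0]
    for _ in range(n):
        q.append(alpha * q[-1] + 1.0)
    return q


def spike_sequence(n):
    """Hypothetical a_k for k = 0..n-1: 1 if k is a power of two, else 0."""
    return [1.0 if k >= 1 and (k & (k - 1)) == 0 else 0.0 for k in range(n)]


def weighted_partial_sum(a):
    """Partial sum of a_k / (k + 2), the alpha = 1 form of the bound."""
    return sum(a_k / (k + 2) for k, a_k in enumerate(a))


if __name__ == "__main__":
    # alpha = 1: the weights grow linearly, q_{k+1} = k + 2.
    print(q_sequence(1.0, 5))
    # alpha < 1: the weights stay below 1 / (1 - alpha).
    print(max(q_sequence(0.5, 50)))
    # The weighted sum stays bounded although a_k = 1 at every power of two.
    a = spike_sequence(2 ** 16)
    print(weighted_partial_sum(a), a[2 ** 10])
```

Along K = {1, 2, 4, 8, ...} the spike sequence is identically 1, while the weighted series is dominated by a geometric series and therefore converges. With α<1 the weights are bounded, so the same summability bound would force a_k → 0 along the whole sequence.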

Referee report (LaTeX)

\textbf{Recommendation:} major revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

The manuscript presents a well-structured optimization framework on the symplectic Stiefel manifold and extends nonmonotone Riemannian line-search analysis to locally defined retractions. The overall contribution is significant and timely. However, the central convergence proof contains an unsubstantiated step for the case $\alpha = 1$, which should be fixed either by restricting $\alpha$ to $[0,1)$ or by adding a lemma that secures $t_k \to 0$ along accumulation subsequences. With this correction, the paper would be solid.
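
For reference, the restriction to $\alpha \in [0,1)$ mentioned in the justification can be spelled out as a short derivation. This is a sketch in the notation of the audit above, assuming the usual Zhang--Hager recursion $q_0 = 1$, $q_{k+1} = \alpha q_k + 1$ and the summability bound quoted there; it is not taken from the manuscript.

```latex
% Assumption: q_0 = 1 and q_{k+1} = \alpha q_k + 1 with \alpha \in [0,1).
\begin{align*}
  q_{k+1} &= \sum_{j=0}^{k+1} \alpha^{j} \;\le\; \frac{1}{1-\alpha}, \\
  \beta (1-\alpha) \sum_{k=0}^{\infty} t_k \,\|\operatorname{grad} f(X_k)\|^2
    &\le \sum_{k=0}^{\infty}
         \frac{\beta\, t_k \,\|\operatorname{grad} f(X_k)\|^2}{q_{k+1}}
    \;\le\; c_0 - c_\infty \;<\; \infty, \\
  \text{hence}\quad
  \lim_{k\to\infty} t_k \,\|\operatorname{grad} f(X_k)\|^2 &= 0
  \quad \text{along the whole sequence.}
\end{align*}
```

For $\alpha = 1$ the first line fails, since $q_{k+1} = k+2$ is unbounded, which is exactly where the additional lemma would be needed.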