arXiv:2101.01323

On the Global Convergence of Randomized Coordinate Gradient Descent for Non-Convex Optimization

Ziang Chen, Yingzhou Li, Jianfeng Lu

Verdict
correct (high confidence)
Category
Not specified
Journal tier
Strong Field
Processed
Sep 28, 2025, 12:55 AM

Audit review

The paper proves the main claim rigorously: under Assumptions 1–3 and step sizes satisfying 0 < α_min < α_max < 1/M, randomized coordinate gradient descent almost surely avoids strict saddles. The proof proceeds via a careful finite-block amplification argument near strict saddles, a positive Lyapunov exponent for the linearized cocycle (Proposition 3.1), and a law-of-large-numbers–style argument (Theorems 4.4 and 4.9, leading to Theorem 1). The model's solution, while thematically aligned (the random-dynamical-systems viewpoint, Oseledets splitting, a nontrivial unstable projection), misapplies off-the-shelf random stable manifold theorems in a setting the paper explicitly cautions against using without additional quantitative control, and, crucially, asserts that x_t (or a linear functional of it) is an affine function of a past step size α_s. That affine-in-α_s claim is incorrect for nonlinear f: later iterates depend on α_s nonlinearly, through the gradients ∇f(x_k) evaluated along the trajectory. The paper avoids this pitfall by deriving finite-time, high-probability amplification bounds rather than relying on such an affine argument.
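To make the affine-in-α_s point concrete, here is a minimal numerical sketch (not from the paper; the quartic test function, horizon, and all names are illustrative assumptions). It fixes the coordinate sequence and all step sizes except one, varies α_s at a single step s, and checks that the second difference of the final iterate in α_s is nonzero, which an affine dependence would rule out.

import numpy as np

def grad_f(x):
    # Hypothetical nonconvex test function, not taken from the paper:
    # f(x) = 0.25*(x_1^4 + x_2^4) - x_1*x_2
    return np.array([x[0]**3 - x[1], x[1]**3 - x[0]])

def rcgd(x0, coords, alphas):
    # Randomized coordinate gradient descent with the randomness frozen:
    # x_{k+1} = x_k - alpha_k * (d_{i_k} f(x_k)) * e_{i_k}
    x = x0.copy()
    for i, a in zip(coords, alphas):
        x[i] -= a * grad_f(x)[i]
    return x

rng = np.random.default_rng(0)
T, s = 20, 5                       # horizon, and the step whose alpha we vary
coords = rng.integers(0, 2, T)     # fixed random coordinate sequence
base = rng.uniform(0.01, 0.1, T)   # fixed step sizes in [alpha_min, alpha_max]
x0 = np.array([1.3, -0.7])

def x_T(alpha_s):
    alphas = base.copy()
    alphas[s] = alpha_s
    return rcgd(x0, coords, alphas)

# If x_T were affine in alpha_s, this central second difference would be
# exactly zero; for nonlinear f it is small but nonzero.
h = 0.01
print(x_T(0.05 + h) - 2 * x_T(0.05) + x_T(0.05 - h))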

Referee report (LaTeX)

\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

The paper provides a careful, technically solid analysis of randomized coordinate gradient descent through a random dynamical systems lens. It avoids overreliance on abstract manifold theory, instead establishing finite-time amplification estimates that combine with an LLN-style argument to yield almost sure escape from strict saddles. The results broaden our understanding of how randomized coordinate methods behave on nonconvex problems and are presented clearly, though some assumptions and constants could be highlighted more prominently, and a few typographical details could be improved.