arXiv:2112.05387
Layer-Parallel Training of Residual Networks with Auxiliary-Variable Networks
Qi Sun, Hexin Dong, Zewei Chen, Jiacheng Sun, Zhenguo Li, Bin Dong
- Verdict: correct (medium confidence)
- Category: Not specified
- Journal tier: Strong Field
- Processed: Sep 28, 2025, 12:56 AM
- arXiv links: Abstract, PDF
Audit review
The paper defines the per-epoch speedup ratio ρ exactly as in Eq. (27) and notes that if T_f + T_b exceeds t_ψ + t_f^(λ) + t_b^(λ), then a sufficiently large K yields a speedup; it also states the finite-K upper bound (T_d + T_f + T_b)/(T_d + t_ψ + t_f^(λ) + t_b^(λ)). The model reproduces Eq. (27) algebraically, gives an explicit threshold K_0 for ρ > 1 under the same assumption, proves the stated bound, shows that the K → ∞ limit equals that bound, and establishes monotonicity in K, all consistent with and extending the paper's sketch. No substantive conflict was found.
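For concreteness, the algebra the audit verifies can be written out. The functional form of ρ below is a reconstruction assumed from the stated consequences (per-stage forward/backward cost (T_f + T_b)/K plus K-independent overheads), not quoted from the paper; Eq. (27) should be consulted for the exact expression. A minimal compilable sketch:

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Hedged reconstruction: this form of \rho is assumed from the consequences
% stated in the audit (bound, limit, monotonicity), not copied from Eq. (27).
Assume the per-epoch speedup ratio takes the form
\begin{equation*}
  \rho(K)
  = \frac{T_d + T_f + T_b}
         {T_d + \frac{T_f + T_b}{K} + t_\psi + t_f^{(\lambda)} + t_b^{(\lambda)}},
  \qquad K \ge 1,
\end{equation*}
where $T_d$, $T_f$, $T_b$ are the serial data-loading, forward, and backward
costs and $t_\psi$, $t_f^{(\lambda)}$, $t_b^{(\lambda)}$ are overheads that do
not depend on the number of stages $K$.

\emph{Monotonicity in $K$.} The only $K$-dependent term, $(T_f + T_b)/K$, is
strictly decreasing in $K$, so the denominator decreases and $\rho$ strictly
increases with $K$.

\emph{Finite-$K$ bound and limit.} Dropping the positive term $(T_f + T_b)/K$
from the denominator gives, for every finite $K$,
\begin{equation*}
  \rho(K)
  < \frac{T_d + T_f + T_b}
         {T_d + t_\psi + t_f^{(\lambda)} + t_b^{(\lambda)}}
  = \lim_{K \to \infty} \rho(K),
\end{equation*}
so the stated upper bound is exactly the $K \to \infty$ limit.
\end{document}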
Referee report (LaTeX)
\textbf{Recommendation:} minor revisions
\textbf{Journal Tier:} strong field
\textbf{Justification:}
The analysis of the speedup is correct and consistent with the parallel training framework. The manuscript would be strengthened by making a few algebraic consequences of Eq. (27) explicit (the finite-$K$ upper bound, its $K \to \infty$ limit, and an explicit threshold $K_0$ for speedup), and by briefly clarifying that the overhead terms $t_\psi$, $t_f^{(\lambda)}$, $t_b^{(\lambda)}$ are independent of $K$ in practice. These are minor clarifications that improve readability without changing the substance.
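To make the requested threshold concrete, a hedged sketch under the same reconstructed form of ρ as in the audit section above (symbols as defined there; if the paper's exact Eq. (27) differs, the threshold changes accordingly):

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Hedged sketch: K_0 follows from the reconstructed \rho(K), assuming the
% audit's speedup condition T_f + T_b > t_\psi + t_f^{(\lambda)} + t_b^{(\lambda)}.
Write $\Delta = (T_f + T_b) - \bigl(t_\psi + t_f^{(\lambda)} + t_b^{(\lambda)}\bigr) > 0$.
Then $\rho(K) > 1$ rearranges to $(T_f + T_b)/K < \Delta$, i.e.
\begin{equation*}
  K > \frac{T_f + T_b}{\Delta},
  \qquad\text{so}\qquad
  K_0 = \left\lfloor \frac{T_f + T_b}{\Delta} \right\rfloor + 1
\end{equation*}
is the smallest number of stages for which the parallel scheme is faster.
\end{document}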