arXiv:2112.05387
Layer-Parallel Training of Residual Networks with Auxiliary-Variable Networks
Qi Sun, Hexin Dong, Zewei Chen, Jiacheng Sun, Zhenguo Li, Bin Dong
- Verdict: correct (medium confidence)
- Category: Not specified
- Journal tier: Strong Field
- Processed: Sep 28, 2025, 12:56 AM
- arXiv links: Abstract, PDF
Audit review
The paper defines the per-epoch speedup ratio ρ exactly as in Eq. (27) and notes that if T_f + T_b exceeds t_ψ + t_f^(λ) + t_b^(λ), then a sufficiently large K yields a speedup; it also states the finite-K upper bound (T_d + T_f + T_b)/(T_d + t_ψ + t_f^(λ) + t_b^(λ)). The model reproduces Eq. (27) algebraically, gives an explicit threshold K_0 for ρ > 1 under the same assumption, proves the stated bound, shows that the K → ∞ limit equals that bound, and establishes monotonicity in K, all consistent with and extending the paper's sketch. No substantive conflict was found.
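For concreteness, the algebra the audit verifies can be written out. The functional form of ρ below is a reconstruction assumed from the stated consequences (per-stage forward/backward cost (T_f + T_b)/K plus K-independent overheads), not quoted from the paper; Eq. (27) should be consulted for the exact expression. A minimal compilable sketch:

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Hedged reconstruction: this form of \rho is assumed from the consequences
% stated in the audit (bound, limit, monotonicity), not copied from Eq. (27).
Assume the per-epoch speedup ratio takes the form
\begin{equation*}
  \rho(K)
  = \frac{T_d + T_f + T_b}
         {T_d + \frac{T_f + T_b}{K} + t_\psi + t_f^{(\lambda)} + t_b^{(\lambda)}},
  \qquad K \ge 1,
\end{equation*}
where $T_d$, $T_f$, $T_b$ are the serial data-loading, forward, and backward
costs and $t_\psi$, $t_f^{(\lambda)}$, $t_b^{(\lambda)}$ are overheads that do
not depend on the number of stages $K$.

\emph{Monotonicity in $K$.} The only $K$-dependent term, $(T_f + T_b)/K$, is
strictly decreasing in $K$, so the denominator decreases and $\rho$ strictly
increases with $K$.

\emph{Finite-$K$ bound and limit.} Dropping the positive term $(T_f + T_b)/K$
from the denominator gives, for every finite $K$,
\begin{equation*}
  \rho(K)
  < \frac{T_d + T_f + T_b}
         {T_d + t_\psi + t_f^{(\lambda)} + t_b^{(\lambda)}}
  = \lim_{K \to \infty} \rho(K),
\end{equation*}
so the stated upper bound is exactly the $K \to \infty$ limit.
\end{document}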
Referee report (LaTeX)
\textbf{Recommendation:} minor revisions
\textbf{Journal Tier:} strong field
\textbf{Justification:}
The analysis of the speedup is correct and consistent with the parallel training framework. The manuscript would be strengthened by making a few algebraic consequences of Eq. (27) explicit (the finite-$K$ upper bound, its $K \to \infty$ limit, and an explicit threshold $K_0$ for speedup), and by briefly clarifying that the overhead terms $t_\psi$, $t_f^{(\lambda)}$, $t_b^{(\lambda)}$ are independent of $K$ in practice. These are minor clarifications that improve readability without changing the substance.
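To make the requested threshold concrete, a hedged sketch under the same reconstructed form of ρ as in the audit section above (symbols as defined there; if the paper's exact Eq. (27) differs, the threshold changes accordingly):

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Hedged sketch: K_0 follows from the reconstructed \rho(K), assuming the
% audit's speedup condition T_f + T_b > t_\psi + t_f^{(\lambda)} + t_b^{(\lambda)}.
Write $\Delta = (T_f + T_b) - \bigl(t_\psi + t_f^{(\lambda)} + t_b^{(\lambda)}\bigr) > 0$.
Then $\rho(K) > 1$ rearranges to $(T_f + T_b)/K < \Delta$, i.e.
\begin{equation*}
  K > \frac{T_f + T_b}{\Delta},
  \qquad\text{so}\qquad
  K_0 = \left\lfloor \frac{T_f + T_b}{\Delta} \right\rfloor + 1
\end{equation*}
is the smallest number of stages for which the parallel scheme is faster.
\end{document}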