2009.11742

A Compute-Bound Formulation of Galerkin Model Reduction for Linear Time-Invariant Dynamical Systems

Francesco Rizzi, Eric J. Parish, Patrick J. Blonigan, John Tencer

correct, medium confidence

Category: Not specified
Journal tier: Strong Field
Processed: Sep 28, 2025, 12:55 AM

Audit review

The paper’s claims are internally consistent and supported by both analysis and measurements. Rank1Galerkin is gemv-dominated, with arithmetic intensity I ≈ 1/4, and is therefore memory-bound; Rank2Galerkin is gemm-dominated, with I ≈ K/16, and is therefore compute-bound. The paper defines the total-time metric τP and reports s(256,2,32) = 12.98. In the many-query demonstration it clearly explains the concurrency choices (8 runs in parallel), which yield a Rank2 total of about 7.5 s and an FOM total of 7280 s, for an ≈970× speedup.

The candidate solution identifies the core kernels and their arithmetic intensities correctly and even reproduces the ≈970× ratio. However, it misapplies τP by assuming an idealized parallel occupancy of 36/n and continuous fractions of sets. This yields incorrect absolute totals (6.667 s vs 7.5 s for Rank2; 6471 s vs 7280 s for FOM) and a slightly inflated Rank1-to-Rank2 ratio (16.8× instead of ≈15×), despite the coincidentally correct speedup ratio.
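
The totals quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the numbers in this review, not a reimplementation of τP; the variable names are illustrative. It suggests why the candidate's speedup ratio survives the error: both of its totals are smaller than the reported ones by nearly the same factor (about 0.889, numerically close to 8/9), so the factor cancels in the ratio. The review does not state where that factor comes from.

```python
# Figures quoted in the audit review (seconds).
reported_rank2_s = 7.5      # Rank2 total reported by the paper
reported_fom_s = 7280.0     # FOM total reported by the paper
candidate_rank2_s = 6.667   # Rank2 total from the candidate solution
candidate_fom_s = 6471.0    # FOM total from the candidate solution

# Speedup of Rank2 over the FOM, computed from each pair of totals.
reported_speedup = reported_fom_s / reported_rank2_s
candidate_speedup = candidate_fom_s / candidate_rank2_s

# Factor by which each candidate total differs from the reported total.
rank2_scale = candidate_rank2_s / reported_rank2_s
fom_scale = candidate_fom_s / reported_fom_s

print(f"reported speedup:  {reported_speedup:.1f}x")
print(f"candidate speedup: {candidate_speedup:.1f}x")
print(f"Rank2 scale: {rank2_scale:.4f}, FOM scale: {fom_scale:.4f}")
```

Because the two scale factors agree to about three decimal places, the ratio is preserved even though neither absolute total matches the paper.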

Referee report (LaTeX)

\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

The work convincingly reframes LTI ROM evaluation as a compute-bound problem via a rank-2 formulation that leverages gemm. The analytical intensity discussion, scaling studies, and an end-to-end many-query demonstration substantiate the claimed speedups and practical utility. Minor clarifications around the ensemble-time model and concurrency choices would further improve transparency and reproducibility.
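
To make the gemv-versus-gemm distinction concrete, the following NumPy sketch advances several trajectories of a linear system either one vector at a time (matrix-vector products) or all at once as columns of a matrix (matrix-matrix products). It is a simplified illustration and not the paper's implementation: the forward-Euler update, the omission of any forcing term, the sizes, and all function names are assumptions. The two intensity helpers show one traffic accounting that reproduces the quoted I ≈ 1/4 and I ≈ K/16, assuming double precision, square K×K operands, and BLAS-style updates in which the output is both read and written; the paper's own accounting may differ.

```python
import numpy as np

def intensity_gemv(K, bytes_per_word=8):
    """Flops per byte for y = A x + y with a K x K matrix (assumed accounting)."""
    flops = 2 * K * K
    traffic = bytes_per_word * (K * K + 3 * K)  # read A, x, y; write y
    return flops / traffic

def intensity_gemm(K, bytes_per_word=8):
    """Flops per byte for C = A B + C with K x K operands (assumed accounting)."""
    flops = 2 * K**3
    traffic = bytes_per_word * 4 * K * K  # read A, B, C; write C
    return flops / traffic

def advance_rank1(A, X0, dt, steps):
    """Advance each column separately: one matrix-vector product per step."""
    X = np.empty_like(X0)
    for j in range(X0.shape[1]):
        x = X0[:, j].copy()
        for _ in range(steps):
            x = x + dt * (A @ x)  # gemv-like kernel
        X[:, j] = x
    return X

def advance_rank2(A, X0, dt, steps):
    """Advance all columns together: one matrix-matrix product per step."""
    X = X0.copy()
    for _ in range(steps):
        X = X + dt * (A @ X)  # gemm-like kernel
    return X

rng = np.random.default_rng(0)
K, M = 64, 8                           # hypothetical state size and number of trajectories
A = rng.standard_normal((K, K)) / K    # hypothetical system matrix
X0 = rng.standard_normal((K, M))       # one initial condition per column

X_rank1 = advance_rank1(A, X0, dt=1e-2, steps=10)
X_rank2 = advance_rank2(A, X0, dt=1e-2, steps=10)
same_result = np.allclose(X_rank1, X_rank2)
print("rank-1 and rank-2 results agree:", same_result)
print("gemv intensity, K=4096:", intensity_gemv(4096))
print("gemm intensity, K=256: ", intensity_gemm(256))
```

Both routines compute the same trajectories; the batched version simply exposes the work as a matrix-matrix product, whose intensity grows with K under the accounting above, whereas the matrix-vector intensity stays near 1/4 regardless of K.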