2012.15488

Data-informed Emulators for Multi-Physics Simulations

Hannah Lu, Dinara Ermakova, Haruko Murakami Wainwright, Liange Zheng, Daniel M. Tartakovsky

correct (medium confidence)

Category: Not specified
Journal tier: Strong Field
Processed: Sep 28, 2025, 12:55 AM

Audit review

The paper defines four emulator formulations for Kd via equations (2.3)–(2.6): function approximation with and without simulated predictors, and dynamic approximation with and without simulated predictors. It constructs practical RF and NN surrogates, with the discretized one-step maps derived in Appendix A. It further reports gains from clustering (k-means with DTW) and from including simulated predictors, and it documents orders-of-magnitude computational speed-ups (26 h per high-fidelity run vs ≈10 min of emulator training). The model’s solution provides rigorous arguments that align with these empirical findings: universal approximation on compact sets (via decision-tree/RF partitions or single-hidden-layer NNs), Euler discretization with a discrete Grönwall bound on error propagation through the learned one-step maps, Bayes-risk monotonicity when observables are added, and ERM improvement when training per cluster. These arguments are not formally proved in the paper, but they are consistent with its methodology and results. The model introduces standard but unstated assumptions (continuity and Lipschitz properties, compactness and positivity for ln Kd, fixed cluster assignment), which we flag explicitly. Thus the paper’s claims are supported empirically and the model supplies compatible theory: both are correct, with different justifications.
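The Grönwall and Bayes-risk arguments mentioned above can be sketched as follows. This is a minimal illustration under assumptions that are not stated in the paper: the true right-hand side $f$ is $L$-Lipschitz with $L > 0$, the learned right-hand side $\hat f$ satisfies $\sup \lVert f - \hat f \rVert \le \varepsilon$ on the region visited, both are advanced with the same forward-Euler step $\Delta t$, and $Y$ is square-integrable. The symbols $f$, $\hat f$, $L$, $\varepsilon$, $y_n$, $\hat y_n$, $e_n$, $X$, $Z$, $Y$ are illustrative and are not the paper's notation.

```latex
% Illustrative notation (not the paper's): y_n is the exact Euler iterate,
% \hat y_n the iterate produced by the learned one-step map. Requires amsmath.
\begin{align*}
y_{n+1} &= y_n + \Delta t\, f(y_n), \qquad
\hat y_{n+1} = \hat y_n + \Delta t\, \hat f(\hat y_n), \qquad
e_n = \lVert y_n - \hat y_n \rVert, \\
e_{n+1} &\le (1 + L\,\Delta t)\, e_n + \Delta t\, \varepsilon, \\
e_n &\le (1 + L\,\Delta t)^n e_0 + \frac{\varepsilon}{L}\left[(1 + L\,\Delta t)^n - 1\right]
     \le e^{L t_n} e_0 + \frac{\varepsilon}{L}\left(e^{L t_n} - 1\right),
     \qquad t_n = n\,\Delta t .
\end{align*}
% Bayes-risk monotonicity under squared loss: adding the observable Z to the
% predictors X cannot increase the best achievable population risk.
\[
\inf_{g}\, \mathbb{E}\big[(Y - g(X, Z))^2\big]
= \mathbb{E}\big[\operatorname{Var}(Y \mid X, Z)\big]
\le \mathbb{E}\big[\operatorname{Var}(Y \mid X)\big]
= \inf_{h}\, \mathbb{E}\big[(Y - h(X))^2\big].
\]
```

The first bound compares the learned trajectory with the exact Euler iterate, not with the continuous-time solution, so the Euler truncation error would have to be added separately. The second inequality concerns population risk and does not by itself guarantee a better fit on a finite training set.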

Referee report (LaTeX)

\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

The manuscript convincingly demonstrates accurate and efficient RF/NN emulators for a complex THMC system, including thoughtful use of simulated predictors and clustering. Results are thorough and reproducible, and conclusions (e.g., speed-ups) are practically important. Three minor additions would improve clarity and generality without altering the main contributions: clarifying the assumptions for dynamic stability, articulating the theoretical intuition for why observables and clustering help, and quantifying the speed-up with a simple formula.
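One possible form for the speed-up formula requested in the report is an amortized ratio: the cost of running the high-fidelity model for every query, divided by the cost of generating training runs, training the emulator, and evaluating it. The sketch below is an assumption about how such a formula might look, not the paper's definition. Only the 26 h per run and the roughly 10 min of training come from the audit above; the function name, the number of training runs, and the number of queries are hypothetical.

```python
def amortized_speedup(n_queries, t_hf_hours, n_train_runs, t_train_hours,
                      t_emulator_hours=0.0):
    """Brute-force cost divided by emulator-workflow cost, both in hours.

    Brute force: every query is a high-fidelity run.
    Emulator workflow: high-fidelity training runs + training + cheap queries.
    """
    brute_force_cost = n_queries * t_hf_hours
    emulator_cost = (n_train_runs * t_hf_hours
                     + t_train_hours
                     + n_queries * t_emulator_hours)
    return brute_force_cost / emulator_cost


# From the audit: 26 h per high-fidelity run, about 10 min of training.
T_HF_HOURS = 26.0
T_TRAIN_HOURS = 10.0 / 60.0

# Hypothetical workload: 50 training runs and 1000 downstream queries,
# with emulator evaluation time treated as negligible.
speedup = amortized_speedup(n_queries=1000,
                            t_hf_hours=T_HF_HOURS,
                            n_train_runs=50,
                            t_train_hours=T_TRAIN_HOURS)
print(f"amortized speed-up: {speedup:.1f}x")
```

With negligible evaluation time, the ratio approaches the number of queries divided by the number of training runs as the workload grows, so the cost of generating training data, rather than the training time, is what limits the achievable speed-up in this sketch.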