arXiv:2107.13995
Limiting dynamics for Q-learning with memory one in two-player, two-action games
J. M. Meylahn
correct, medium confidence
- Category: math.DS
- Journal tier: Specialist/Solid
- Processed: Sep 28, 2025, 12:56 AM
Audit review
The paper enumerates all 256 pure memory-one strategy pairs via an indicator-function formulation of the Bellman equations and finds that exactly three symmetric pairs solve them in the discounted IPD: All-D (no condition on δ), Grim Trigger (GT) for δ > (t−r)/(t−p), and win-stay, lose-shift (WSLS) for δ > (t−r)/(r−p). No asymmetric pair solves the Bellman equations. The supporting passages are the model setting and payoff enumeration, the All-D worked example, the summary table, and the paper's explicit statement that only the three symmetric solutions exist.
The candidate solution analytically derives the same three profiles and thresholds. It argues that any profile prescribing a mismatched next state (2 or 3, i.e., one in which the two players' actions differ) cannot be self-consistent, which is a different, more analytic route to the same classification.
There are two discrepancies. First, in the optional note on pmBR dynamics, the paper reports persistent 2-cycles at medium and large δ, whereas the candidate claims that such cycles vanish past certain thresholds; the paper's pmBR graphs explicitly show the cycles remaining (e.g., between nodes 2 and 17) at δ=0.4 and δ=0.8. Second, the paper encodes strict best responses (>) and therefore states strict thresholds, while the candidate allows weak best responses at equality.
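The brute-force classification described in this review can be sketched in a few lines of Python. This is a minimal illustration, not the paper's indicator-function formulation: it swaps in a one-shot-deviation check with strict inequalities in all four memory-one states. The payoff values (t=4, r=3, p=1, s=0), the sucker payoff s, the state ordering CC, CD, DC, DD (own action first), and all function names are assumptions made for the example. Strategies are written as four-letter strings over that state order, so All-D is `DDDD`, GT is `CDDD`, and WSLS is `CDDC`.

```python
from fractions import Fraction
from itertools import product

# States are last-round joint actions, written as (own action, opponent's action).
STATES = ("CC", "CD", "DC", "DD")

def swap(state):
    """The same joint action as seen by the other player."""
    return state[1] + state[0]

def value(state, own, opp, pay, delta):
    """Exact discounted payoff to the 'own' player when both follow their strategies."""
    seen, rewards = {}, []
    while state not in seen:
        seen[state] = len(rewards)
        state = own[state] + opp[swap(state)]  # next joint action is the next state
        rewards.append(pay[state])
    k = seen[state]  # the deterministic path cycles back to the k-th visited state
    head = sum(delta**i * x for i, x in enumerate(rewards[:k]))
    cycle = sum(delta**j * x for j, x in enumerate(rewards[k:]))
    return head + delta**k * cycle / (1 - delta ** (len(rewards) - k))

def strictly_optimal(own, opp, pay, delta):
    """True if the prescribed action is the strict greedy action in every state."""
    for state in STATES:
        b = opp[swap(state)]  # opponent's action in this state
        q = {a: pay[a + b] + delta * value(a + b, own, opp, pay, delta) for a in "CD"}
        other = "D" if own[state] == "C" else "C"
        if q[own[state]] <= q[other]:  # ties fail under the strict convention
            return False
    return True

def solutions(t, r, p, s, delta):
    """All pure memory-one pairs passing the strict one-shot-deviation check."""
    pay = {"CC": r, "CD": s, "DC": t, "DD": p}
    strategies = [dict(zip(STATES, acts)) for acts in product("CD", repeat=4)]
    found = []
    for s1 in strategies:
        for s2 in strategies:
            if strictly_optimal(s1, s2, pay, delta) and strictly_optimal(s2, s1, pay, delta):
                found.append(("".join(s1[x] for x in STATES),
                              "".join(s2[x] for x in STATES)))
    return found

if __name__ == "__main__":
    # Illustrative payoffs only; thresholds are then 1/3 (GT) and 1/2 (WSLS).
    for delta in (Fraction(1, 5), Fraction(2, 5), Fraction(4, 5)):
        print(delta, solutions(4, 3, 1, 0, delta))
```

With these assumed payoffs, the GT and WSLS thresholds are 1/3 and 1/2, so the three values of δ in the usage loop fall below both, between them, and above both. Exact `Fraction` arithmetic is used so that the strict comparison is not blurred by floating-point rounding near a threshold.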
Referee report (LaTeX)
\textbf{Recommendation:} minor revisions
\textbf{Journal Tier:} specialist/solid
\textbf{Justification:}
The paper provides a clear, correct, and reproducible characterization of pure memory-one absorbing states in the discounted IPD, showing that only All-D, GT, and WSLS survive (the latter two above sharp $\delta$-thresholds) and that no asymmetric solutions exist. The pmBR graphs add useful insight into possible limit cycles under simultaneous best-response updates. Minor revisions would improve clarity around the strict-inequality convention and correct small typographical issues.
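The thresholds and the strict-inequality convention can be checked with a short one-shot-deviation calculation. The sketch below assumes the usual labels t (temptation) > r (reward) > p (punishment) and compares conforming with a single defection in the mutual-cooperation state; it is an illustration, not the paper's own derivation.

```latex
% Assumed labels: t > r > p are the temptation, reward, and punishment payoffs;
% \delta is the discount factor. Left side: conform. Right side: defect once.
\begin{align*}
\text{GT at state CC:}\quad
  \frac{r}{1-\delta} &> t + \frac{\delta p}{1-\delta}
  \iff \delta\,(t-p) > t-r
  \iff \delta > \frac{t-r}{t-p},\\
\text{WSLS at state CC:}\quad
  r + \delta r + \frac{\delta^{2} r}{1-\delta} &> t + \delta p + \frac{\delta^{2} r}{1-\delta}
  \iff \delta\,(r-p) > t-r
  \iff \delta > \frac{t-r}{r-p}.
\end{align*}
```

At equality the two sides tie, so a strict best-response convention excludes the boundary value of $\delta$ while a weak convention admits it. The WSLS threshold lies below 1 only when $t - r < r - p$.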