Last week a bot embarrassed us. Cursor Bugbot ran across five PRs on our agent security testing framework and filed nine real issues: an HTTP 413 handler that returned an empty body, undefined variables that only surfaced in live mode, regex patterns being compared as literal substrings, and a metric definition in an arXiv citation that directly contradicted what we were computing. Every finding w