docs: add WORK_IN_PROGRESS.md and document false-positive protection

- WORK_IN_PROGRESS.md captures the v0.2.1→v0.2.3 incident, root cause,
  and the optional follow-ups (preserve dedicated sessions during swap,
  Telegram alert on SwapRequested, /quota/status endpoint).
- architecture.md §2.2.1 describes the four-layer defense:
  strict patterns, 5xx veto, two-poll confirmation, post-swap cooldown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ubuntu 2026-04-15 19:51:15 +00:00
parent 62e98cb9e7
commit 5cad53ac7a
2 changed files with 105 additions and 0 deletions

@@ -75,6 +75,42 @@ re-implements dispatch selection natively in Go.
- Emits `QuotaWarning` and `SwapRequested` events when thresholds in
the config are crossed.
#### 2.2.1 False-positive protection (v0.2.3+)
Four layers prevent transient upstream errors from being mistaken for
quota exhaustion and triggering useless (and destructive) swaps:
1. **Strict pattern matching** (`isQuotaExhausted`). `quotaPatterns`
keys on specific phrases that only a real 429 surfaces:
`"you've hit your limit"`, `"rate_limit_error"` (Anthropic typed
error), `"quota exceeded"`, `"usage limit reached"`,
`"claude pro usage"`, `"too many requests"`, `"5-hour limit"`.
The generic substring `"rate limit"` is **not** a pattern — it
matches inside the bodies of unrelated error transcripts.
2. **Server-error veto** (`hasServerError`). If the same pane also
contains `"api_error"`, `"overloaded_error"`, `"internal server
error"`, or `"api error: 5"`, the quota match is vetoed. An
Anthropic 500/503 response is surfaced in the Claude Code
conversation transcript and stays visible until the user scrolls;
without this veto it would be re-matched on every poll.
3. **Two-poll confirmation** (`Monitor.suspectedHitAt`). A hit with no
parseable reset time (real 429s always include one, see
`extractResetTime`) is treated as *suspected* on the first poll and
only emits `SwapRequested` if a second consecutive poll also
detects the same condition. A single-poll flash is absorbed.
4. **Post-swap cooldown** (`state.QuotaState.LastSwapAt` +
`quota.reactivate_cooldown`, default 5m). After a swap, the monitor
suppresses all detection for the cooldown window, breaking the
ping-pong failure mode where both accounts appear exhausted in
alternation.
Forensic logging on every `SwapRequested` includes the triggering
session name, the matched pattern, and a 120-char snippet of the pane
so production incidents can be diagnosed from `journalctl` alone.
### 2.3 session-watcher
- Maintains an in-memory table keyed by tmux session name (`ccl-*`).