feat: SessionLifecycleManager — auto-detect and repair dead tmux sessions

- Add internal/lifecycle/manager.go with the Manager struct, a Run() ticker
  loop (15s interval), EnsureAllSessions() for boot-time session creation,
  and reconcile(), which recreates dead sessions that were idle and
  recovers ones that died while working via SetFailed + CreateSession
- Add state.SetFailed() to record crash timestamp on SessionState
- Add internal/lifecycle/manager_test.go with a mock tmux client and 3
  tests, all passing: TestReconcileCreatesDeadSession,
  TestReconcileRecoversCrashedSession, TestEnsureAllSessions
- Wire lifecycle.Manager into cmd/claude-failover/main.go after state init

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ubuntu 2026-04-14 18:02:25 +00:00
parent 2d43580c18
commit 978b60ccf7
10 changed files with 810 additions and 32 deletions

.gitignore (6 lines changed)

@@ -31,8 +31,8 @@ config.local.yaml
 *.swo
 .DS_Store
-# Runtime / state
-state/
-checkpoints/
+# Runtime / state (top-level only, not internal/state package)
+/state/
+/checkpoints/
 tmp/
 .agent-queue/