- internal/dispatcher: fsnotify sur inbox/, fallback poll 60s, launchAgent - parseFrontmatter YAML, modelForPriority (critical→opus, reste→sonnet) - waitForPrompt polling ❯, buildTaskMessage, 1 tache par session - isSessionFree: check tmux liveness + state idle + cooldown 5min - 5 tests unitaires (parse, model, dispatch, no-session, missing-tmux) - go.mod: ajout fsnotify v1.9.0 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|---|---|---|
| cmd/claude-failover | ||
| docs | ||
| internal | ||
| scripts | ||
| .gitignore | ||
| CLAUDE.md | ||
| config.example.yaml | ||
| go.mod | ||
| go.sum | ||
| LICENSE | ||
| README.md | ||
| VERSION.md | ||
claude-failover
Go daemon for Claude Code multi-account session orchestration with automatic quota-based failover.
Overview
claude-failover orchestrates a pool of Claude Code sessions running under
multiple Anthropic accounts. When the active account reaches its quota
threshold (5-hour usage window or weekly cap), the daemon transparently fails
over the workload to a backup account without losing in-flight session state.
It is the runtime glue behind the SecuAAS agent pool (ccl-0..ccl-9,
ccl-auto-11..ccl-auto-20) and is engineered to hold sessions warm across
account swaps by sharing the ~/.claude/projects/ transcript tree via
symlinks.
Architecture (goroutines)
The daemon is a single Go binary composed of cooperating goroutines:
- dispatcher — reads
.agent-queue/inbox/*.mdacross registered projects and assigns tasks to idle sessions. - quota-monitor — polls each configured Anthropic account's usage window and triggers a failover when the active account crosses its threshold.
- session-watcher — tracks tmux session liveness (
ccl-*), heartbeats, and.agent-queue/status.jsontransitions (idle / working). - checkpoint — periodically snapshots session context (current task, last tool call, working dir) so an interrupted session can resume on a different account.
- janitor — cleans stale
.dispatchedmarkers, archives olddone/tasks, prunes expired checkpoints. - notifier — pushes state changes (failover fired, session degraded, task failed) to Telegram / MCP dashboard / log aggregator.
- account-switcher — performs the actual swap: stop sessions on account A, rehome symlinks, relaunch sessions on account B, replay last checkpoint. Serialized via a single mutex so only one swap can happen at a time.
All goroutines communicate through typed channels plus a shared state struct
behind a sync.RWMutex. The daemon exposes an HTTP control plane for the
MCP server to query status and force-trigger operations.
Relationship to SecuAAS agent-orchestrator
This project extracts the session-management and failover logic that
currently lives in dev-management/agent-orchestrator/ (shell scripts:
launch-agent.sh, graceful-switch.sh, watchdog.sh,
checkpoint-daemon.sh, start-dedicated-agents.sh) and reimplements it
as a single Go service. See the orchestrator docs for the operational
context this daemon is designed to replace.
Repository layout
cmd/claude-failover/ Main entrypoint
docs/ Architecture, configuration, analysis notes
scripts/ Setup helpers (shared-projects symlink, etc.)
config.example.yaml Annotated example config
Status
Pre-alpha. Design and scaffolding only — no working binary yet.
License
MIT — see LICENSE.