Architecture
System Layout
┌─────────────────── Dell k3s (100.95.212.93) ──────────────────┐
│ │
│ rig-conductor namespace │
│ ┌──────────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ rig-conductor │ │ Valkey │ │ Postgres │ │ Cost │ │
│ │ API (.NET) │ │ (Redis) │ │ (Marten) │ │Dashboard │ │
│ │ - webhooks │ │ - streams│ │ - events │ │ │ │
│ │ - merge │ │ - signals│ │ - logs │ │ │ │
│ │ - dashboard │ │ - session│ │ - costs │ │ │ │
│ └──────┬───────┘ └─────┬────┘ └──────────┘ └──────────┘ │
│ │ │ │
│ dev-e namespace (KEDA scale-to-zero) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Dev-E (Node) │ │Dev-E (Dotnet)│ │Dev-E (Python)│ │
│ │ :node image │ │ :dotnet image│ │ :python image│ │
│ │ 0-1 replicas │ │ 0-1 replicas │ │ 0-1 replicas │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ review-e namespace (KEDA scale-to-zero) │
│ ┌──────────────┐ │
│ │ Review-E │ │
│ │ 0-1 replicas │ │
│ └──────────────┘ │
│ │
│ keda namespace │
│ ┌──────────────┐ │
│ │ KEDA 2.16 │ Watches signal:{agentId} lists │
│ └──────────────┘ │
└────────────────────────────────────────────────────────────────┘
┌──── Mac Mini M4 ────┐ ┌──── Human Dev ──────┐
│ iBuild-E (launchd) │ │ Claude Code / Codex │
│ Polls rig-conductor │ │ conductor-e-hook.sh │
│ iOS/macOS tasks │ │ Reports to Conductor │
└─────────────────────┘ └──────────────────────┘
Event-Driven Pipeline
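Every action flows through rig-conductor's event store. In-cluster agents are woken by events rather than polling for work on a timer; the periodic checks that remain are KEDA's 15-second read of the signal list, the merge step polling mergeable_state, and iBuild-E polling rig-conductor.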
Issue labeled "agent-ready"
│
▼ GitHub webhook
rig-conductor
├─ Records ISSUE_APPROVED event
├─ Reads .rig-agent.yaml → determines stack (node/dotnet/python/ios)
├─ Routes to agent: XADD assignments:dev-e-{stack} + LPUSH signal:dev-e-{stack}
├─ Records ISSUE_ASSIGNED event
│
▼ KEDA detects signal (LLEN > 0) → scales 0→1
Dev-E pod starts (~25s)
├─ Consumes from Valkey stream (XREADGROUP)
├─ Creates execution log in rig-conductor
├─ Runs Claude CLI with issue prompt
├─ Clones repo, creates branch, implements, tests, commits, pushes
├─ Creates PR with "Closes #N"
│
▼ GitHub webhook (pull_request.opened)
rig-conductor
├─ Records PR_CREATED event (links PR to issue)
├─ Records REVIEW_ASSIGNED event
├─ Routes to Review-E: XADD assignments:review-e + LPUSH signal:review-e
│
▼ KEDA wakes Review-E
Review-E pod starts
├─ Reviews PR diff
├─ Approves or requests changes
│
├─ If APPROVED:
│ ├─ Stream consumer detects "approved" in output
│ ├─ Posts REVIEW_PASSED event
│ ├─ Calls POST /api/merge
│ │
│ ▼ rig-conductor merges
│ ├─ Waits for CI clean (polls mergeable_state)
│ ├─ Checks for do-not-merge label
│ ├─ Squash merges via GitHub API
│ ├─ Records MERGED + ISSUE_DONE events
│
├─ If CHANGES_REQUESTED:
│ ├─ Records REVIEW_DISPUTED event
│ ├─ Routes back to Dev-E with review feedback
│ ├─ Clears review dedup (allows re-review after fix)
│ ▼ Dev-E iterates on same branch, pushes fix
│ └─ Webhook (synchronize) → re-routes to Review-E
│
▼ KEDA cooldown (5 min) → scales 1→0
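As a rough illustration of how an issue's current state could be derived from the recorded events, the sketch below replays event names in order. The event names come from the pipeline above; the state labels (`in_progress`, `pr_open`, and so on) and the `IssueState` type are hypothetical and are not taken from rig-conductor, which is a .NET service.

```python
from dataclasses import dataclass, field

# Event names are from the pipeline above; the state labels on the right
# are hypothetical names chosen for this sketch.
TRANSITIONS = {
    "ISSUE_APPROVED": "approved",
    "ISSUE_ASSIGNED": "in_progress",
    "PR_CREATED": "pr_open",
    "REVIEW_ASSIGNED": "in_review",
    "REVIEW_PASSED": "ready_to_merge",
    "REVIEW_DISPUTED": "in_progress",  # routed back to Dev-E
    "MERGED": "merged",
    "ISSUE_DONE": "done",
}


@dataclass
class IssueState:
    status: str = "new"
    disputes: int = 0  # number of REVIEW_DISPUTED events seen
    history: list = field(default_factory=list)


def apply_event(state: IssueState, event_type: str) -> IssueState:
    """Fold one event into the state; unknown events are only recorded."""
    state.history.append(event_type)
    if event_type in TRANSITIONS:
        state.status = TRANSITIONS[event_type]
    if event_type == "REVIEW_DISPUTED":
        state.disputes += 1
    return state


def replay(events) -> IssueState:
    """Rebuild the current state by replaying events in recorded order."""
    state = IssueState()
    for event_type in events:
        apply_event(state, event_type)
    return state
```

Replaying the full list for an issue gives the same answer every time, which is the property an event store relies on: the stored events are the source of truth and any view of state can be recomputed from them.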
Repos
| Repo | Visibility | Purpose |
|---|---|---|
| rig-agent-runtime | Public | Agent runtime — Dockerfiles, Helm chart, stream consumer, CLI providers |
| rig-conductor | Private | .NET API — event store, webhooks, dashboard, merge logic |
| rig-gitops | Private | FluxCD manifests — HelmReleases, KEDA ScaledObjects, secrets |
| rig-tools | Private | Developer hooks, workflow sync script, install.sh |
| infra | Private | Terraform — GitHub, Cloudflare, GCP, k8s config |
Multi-Stack Images
All images extend rig-agent-runtime:base (git, gh, claude-cli, codex-cli, Node.js 22):
| Image | Extra Tools | For |
|---|---|---|
| `:node` | TypeScript, Jest, ESLint, Prettier | JS/TS repos |
| `:dotnet` | .NET 10 SDK | C# repos |
| `:python` | Python 3, pytest, black, ruff | Python repos |
| `:base` | Core tools only | Default |
Agents can install additional tools at runtime (npm, pip, apt-get).
Per-Repo Config
Each repo has .rig-agent.yaml:
stack: node # which image to use
tools: # extra tools to install
- firebase-tools
testCommand: npm test
buildCommand: npm run build
escalate:
- "needs Xcode (requires-macos)"
rig-conductor reads this on every assignment to determine routing.
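The routing step can be pictured as a small pure function from the parsed config to an image tag and the Valkey keys used elsewhere in this document. This is a minimal sketch, not rig-conductor's implementation: it assumes the YAML has already been loaded into a dict, the function name `resolve_route` is hypothetical, and the image is written as the bare `rig-agent-runtime:{stack}` tag because the registry path is not given here.

```python
# Stacks that have a Dev-E deployment in the system layout. "ios" is left
# out on purpose: those tasks go to iBuild-E, which runs outside the cluster.
CLUSTER_STACKS = {"node", "dotnet", "python"}


def resolve_route(config: dict) -> dict:
    """Map a parsed .rig-agent.yaml to an agent, image tag, and Valkey keys.

    `config` is assumed to be the YAML file already loaded into a dict.
    """
    stack = config.get("stack")
    if stack not in CLUSTER_STACKS:
        raise ValueError(f"no in-cluster Dev-E agent for stack {stack!r}")
    agent = f"dev-e-{stack}"
    return {
        "agent": agent,
        "image": f"rig-agent-runtime:{stack}",  # registry prefix omitted
        "stream": f"assignments:{agent}",       # XADD target
        "signal": f"signal:{agent}",            # LPUSH target watched by KEDA
        "tools": list(config.get("tools") or []),
    }
```

For the example config above, `resolve_route({"stack": "node", "tools": ["firebase-tools"]})` yields the `dev-e-node` agent with the `assignments:dev-e-node` stream and `signal:dev-e-node` list.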
KEDA Scale-to-Zero
Agents scale to zero when idle. Wake-up uses a signal list pattern:
- rig-conductor publishes: XADD assignments:{agent} (work) + LPUSH signal:{agent} (wake signal)
- KEDA watches LLEN signal:{agent} every 15 seconds
- When LLEN > 0 → scales deployment 0→1
- Agent starts, deletes signal key, processes work from stream
- After 5 min idle → KEDA scales 1→0
This avoids a chicken-and-egg problem with scaling on the Redis stream directly: the stream scaler needs a consumer, but the consumer runs in the pod that is scaled to 0.
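The signal list pattern can be sketched with three small functions. This is an illustration under assumptions, not the actual rig-conductor or agent code: `client` is any object exposing `xadd`, `lpush`, `llen`, and `delete` with redis-py-style signatures, and the `payload` field name and the `"wake"` value are hypothetical.

```python
import json


def publish_assignment(client, agent: str, payload: dict) -> None:
    """Queue the work on the stream, then push a wake signal for KEDA."""
    # "payload" is a hypothetical field name for this sketch.
    client.xadd(f"assignments:{agent}", {"payload": json.dumps(payload)})
    # The pushed value is arbitrary; only the list length matters.
    client.lpush(f"signal:{agent}", "wake")


def should_scale_up(client, agent: str) -> bool:
    """Mirror of the KEDA check: a non-empty signal list means wake."""
    return client.llen(f"signal:{agent}") > 0


def on_agent_start(client, agent: str) -> None:
    """Delete the signal key so the list no longer holds the pod up."""
    client.delete(f"signal:{agent}")
```

Keeping the wake signal in a plain list, separate from the stream, means the scaler only needs `LLEN`, which works whether or not any consumer has ever connected to the stream.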
Human Developer Integration
Humans using any AI tool report to rig-conductor via rig-tools:
# Install (one time)
git clone git@github.com:Stig-Johnny/rig-tools.git && cd rig-tools && ./install.sh
# Automatic for Claude Code (via hooks in settings.json)
# Manual for other tools:
conductor-e-hook WORK_STARTED
conductor-e-hook PR_CREATED --pr 42 --url https://...
Dashboard shows human developers alongside AI agents.
Dashboard
URL: https://rig-conductor.dashecorp.com/
Tabs:
- Overview: Agent status (online/offline, provider, task), queue depth
- Issues: All tracked issues with state, PR, agent, cost. Sortable + filterable.
- Events: Live SSE event stream
- Costs: Per-agent cost breakdown
- Logs: Live agent CLI output (Valkey pub/sub → SSE)

Issue detail panel shows:
- Event timeline (every webhook + agent event)
- Cost per step
- Execution runs (turns, tokens in/out/cache, duration, PR link)
Light/dark mode. Agent filter (online, 24h, 7d, all).
Execution Logs
Stored in Marten (PostgreSQL) as ExecutionLog documents:
Run: dev-e-node/5n72v · 12 Apr 09:20 · PR #131
Status: completed · 93s · $0.21 · 14 turns
Tokens: 1,234 in / 567 out / 10,921 cache-read / 4,305 cache-write
Steps:
assigned ✓ Assigned to dev-e-node
implement ✓ Created feature branch, implemented, pushed
Retention: raw logs 30 days, summaries 90 days.
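The retention rule can be expressed as a simple age check. The sketch below is only an illustration: the `ExecutionLog` fields are a hypothetical subset modelled on the example run above (the real Marten document is a .NET type), and treating the limits as strict "older than" comparisons is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

RAW_RETENTION = timedelta(days=30)      # raw logs
SUMMARY_RETENTION = timedelta(days=90)  # summaries


@dataclass
class ExecutionLog:
    # Hypothetical subset of fields, modelled on the example run above.
    agent: str
    run_id: str
    started_at: datetime
    status: str
    duration_s: int
    cost_usd: float
    turns: int


def retention_action(log: ExecutionLog, now: datetime) -> str:
    """Return "keep", "drop_raw" (summary only), or "delete" for a log."""
    age = now - log.started_at
    if age > SUMMARY_RETENTION:
        return "delete"
    if age > RAW_RETENTION:
        return "drop_raw"
    return "keep"
```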
Resource Requests
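Single-node cluster rule: every pod requests 1m CPU / 1Mi memory and sets no CPU limit (burstable). Because the scheduler places pods by their requests rather than by actual usage, requests this small mean a pod is practically never left unschedulable for lack of allocatable CPU or memory on the node.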
Credentials
| Secret | Type | Rotation |
|---|---|---|
| Claude OAuth token | `sk-ant-oat01-...` | 1-year, in Bitwarden |
| GitHub App PEM | Per-agent apps | Auto-refresh (1h tokens) |
| RELEASE_PAT | Review-E PAT | Expires 2026-05-30 |
| GHCR pull secret | Container registry | Long-lived |
| Discord bot token | Per-agent bots | Long-lived |