Merged
17 commits
02624d4
feat(poolops): extract channel-agnostic pool operations
nnemirovsky May 18, 2026
115360a
feat(cli): wire pool subcommand through poolops
nnemirovsky May 18, 2026
45b1d38
feat(api): rest endpoints for credential pools
nnemirovsky May 18, 2026
5d09156
feat(telegram): /pool create|list|status|rotate|remove commands
nnemirovsky May 18, 2026
397fa77
fix(proxy): scope pool token-host phantom expansion to refresh_token …
nnemirovsky May 18, 2026
cb12269
feat(telegram): friendlier pool failover notification text
nnemirovsky May 18, 2026
f2e049c
docs: document /pool and /api/pools surfaces; channel-parity plan
nnemirovsky May 18, 2026
dbc636b
fix(api): dedicated pool-referenced 409 schema; build create 201 from…
nnemirovsky May 18, 2026
ee4c16b
fix(proxy): gate grant_type probe to token-host POSTs; raise+observe …
nnemirovsky May 18, 2026
cb095a7
fix(proxy): drop awkward parenthetical from empty-reason failover notice
nnemirovsky May 18, 2026
148a54e
fix(telegram): escape pool name in rotate-race hint; assert pool removal
nnemirovsky May 18, 2026
3cb793d
docs(readme): note bearer auth required for REST API examples
nnemirovsky May 18, 2026
773f8a6
fix(telegram): drop dead ErrCredentialInUseByPool branch in poolRemove
nnemirovsky May 18, 2026
99703b0
fix(api): map internal pool-create failures to 500 not 400
nnemirovsky May 18, 2026
7986156
fix(proxy): make unknown failover reason read naturally; drop dead em…
nnemirovsky May 18, 2026
49dacdf
fix(telegram): HTML-escape pool member LastFailureReason in status
nnemirovsky May 18, 2026
887a72e
fix(telegram): html-escape bind hint in pool create reply
nnemirovsky May 18, 2026
2 changes: 2 additions & 0 deletions CLAUDE.md
@@ -181,6 +181,8 @@ sluice pool rotate <name> # operator override: advance active member
sluice pool remove <name>
```

Pools are reachable from all channels — CLI `sluice pool`, REST `/api/pools` (`GET`/`POST`, `GET`/`DELETE /api/pools/{name}`, `POST /api/pools/{name}/rotate`), and Telegram `/pool` — all via the channel-agnostic `internal/poolops`.

Auto-failover on 429/401 is primary; `pool rotate` is an override.

**Data model (migration `000006_credential_pools`):** `credential_pools` (name, strategy reserved `failover`), `credential_pool_members` (ordered, pool->credential FK), `credential_health` (`healthy|cooldown`, `cooldown_until`, `last_failure_reason`), all CHECK-constrained. Store API in `internal/store/pools.go`. `reloadAll` loads pool+health into an atomic-pointer-swapped `PoolResolver` (`internal/vault/pool.go`), rewired via `srv.StorePool`/`SetPoolResolver` on SIGHUP and the 2s data-version watcher.
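
The atomic-pointer swap described above can be sketched roughly as follows. This is a minimal illustration, not the code in `internal/vault/pool.go`: the `poolSnapshot` and `resolverHolder` types and their methods are hypothetical names, and only the general pattern (build a new immutable snapshot, then publish it with one atomic store) is taken from the text.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// poolSnapshot is a hypothetical immutable view of pool state. The real
// PoolResolver may carry health and ordering data as well.
type poolSnapshot struct {
	// active maps pool name -> currently active member credential.
	active map[string]string
}

// resolverHolder publishes snapshots through an atomic pointer, so readers
// never take a lock and never see a half-updated view.
type resolverHolder struct {
	cur atomic.Pointer[poolSnapshot]
}

// Store publishes a freshly built snapshot (for example after a reload).
func (h *resolverHolder) Store(s *poolSnapshot) { h.cur.Store(s) }

// Active returns the active member of a pool from the current snapshot.
func (h *resolverHolder) Active(pool string) (string, bool) {
	s := h.cur.Load()
	if s == nil {
		return "", false
	}
	m, ok := s.active[pool]
	return m, ok
}

func main() {
	var h resolverHolder
	h.Store(&poolSnapshot{active: map[string]string{"openai": "codex_a"}})
	m, _ := h.Active("openai")
	fmt.Println(m)

	// A reload builds a new snapshot and swaps it in; the old one is untouched.
	h.Store(&poolSnapshot{active: map[string]string{"openai": "codex_b"}})
	m, _ = h.Active("openai")
	fmt.Println(m)
}
```

A design like this keeps the request path cheap: a lookup costs one atomic load, and a reload triggered by SIGHUP or the data-version watcher never blocks in-flight readers.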
30 changes: 30 additions & 0 deletions README.md
@@ -329,6 +329,11 @@ Manage sluice from your phone. Approve connections and tool calls, add credentia
| `/mcp list` | List registered MCP upstreams |
| `/mcp add <name> --command <cmd> [flags]` | Register a new MCP upstream (stdio/http/websocket, see `/help`; chat message auto-deleted because `--env` may carry secrets) |
| `/mcp remove <name>` | Remove an MCP upstream |
| `/pool create <name> <a,b[,c]>` | Create a credential pool (ordered OAuth members, failover order) |
| `/pool list` | List credential pools |
| `/pool status <name>` | Active member and per-member health |
| `/pool rotate <name>` | Operator override: advance the active member |
| `/pool remove <name>` | Remove a credential pool |
| `/status` | Proxy stats and pending approvals |
| `/audit recent [N]` | Last N audit entries |

@@ -338,6 +343,11 @@ Manage sluice from your phone. Approve connections and tool calls, add credentia

REST API on port 3000 for programmatic approval integration. `GET /api/approvals` lists pending requests, `POST /api/approvals/{id}/resolve` resolves them. Use this to build custom approval UIs or integrate with existing workflows.

All `/api/*` endpoints below are protected by bearer auth. Every request must
send `Authorization: Bearer $SLUICE_API_TOKEN` (the token sluice prints at
startup). The curl examples omit the header for brevity, but it is required
for the credential, pool, and rule calls shown here.

Credential management endpoints support both static and OAuth types:

```bash
@@ -350,6 +360,26 @@ curl -X POST http://localhost:3000/api/credentials \
  -d '{"name":"openai_oauth","type":"oauth","token_url":"https://auth.example.com/token","access_token":"at-xxx","refresh_token":"rt-xxx","destination":"api.openai.com","env_var":"OPENAI_API_KEY"}'
```

Credential pools are also managed over REST, with the same operations as the CLI `sluice pool` and Telegram `/pool` commands:

```bash
# List pools
curl http://localhost:3000/api/pools

# Create a pool (members are ordered OAuth credential names; strategy defaults to "failover")
curl -X POST http://localhost:3000/api/pools \
  -d '{"name":"openai","members":["codex_a","codex_b"]}'

# Pool status (active member + per-member health)
curl http://localhost:3000/api/pools/openai

# Operator override: advance the active member
curl -X POST http://localhost:3000/api/pools/openai/rotate

# Remove a pool
curl -X DELETE http://localhost:3000/api/pools/openai
```

## Data Loss Prevention

Two complementary inspection layers protect against credential leakage and dangerous tool use:
255 changes: 255 additions & 0 deletions api/openapi.yaml
@@ -519,6 +519,139 @@ paths:
              schema:
                $ref: "#/components/schemas/ErrorResponse"

  /api/pools:
    get:
      operationId: getApiPools
      summary: List credential pools
      tags: [pools]
      responses:
        "200":
          description: Credential pools
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: "#/components/schemas/Pool"
    post:
      operationId: postApiPools
      summary: Create a credential pool
      tags: [pools]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreatePoolRequest"
      responses:
        "201":
          description: Pool created
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Pool"
        "400":
          description: Invalid request
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ErrorResponse"
        "409":
          description: Pool name collides or a member is already pooled
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ErrorResponse"
        "500":
          description: Internal error creating the pool (transaction/DB failure)
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ErrorResponse"

  /api/pools/{name}:
    get:
      operationId: getApiPoolsName
      summary: Pool status (active member + per-member health)
      tags: [pools]
      parameters:
        - name: name
          in: path
          required: true
          schema:
            type: string
      responses:
        "200":
          description: Pool status
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/PoolStatus"
        "404":
          description: Pool not found
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ErrorResponse"
    delete:
      operationId: deleteApiPoolsName
      summary: Remove a credential pool
      tags: [pools]
      parameters:
        - name: name
          in: path
          required: true
          schema:
            type: string
      responses:
        "204":
          description: Pool removed
        "404":
          description: Pool not found
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ErrorResponse"
        "409":
          description: >-
            Pool still referenced by one or more bindings. The response body's
            `bindings` field lists the blocking bindings (id + destination).
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/PoolReferencedErrorResponse"

  /api/pools/{name}/rotate:
    post:
      operationId: postApiPoolsNameRotate
      summary: Operator override — advance the active pool member
      tags: [pools]
      parameters:
        - name: name
          in: path
          required: true
          schema:
            type: string
      responses:
        "200":
          description: Rotation result
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/PoolRotateResult"
        "404":
          description: Pool not found
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ErrorResponse"
        "409":
          description: Rotate raced a concurrent membership change
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ErrorResponse"

  /api/audit/recent:
    get:
      operationId: getApiAuditRecent
@@ -619,6 +752,36 @@ components:
        code:
          type: string

    PoolReferencedErrorResponse:
      description: >-
        409 body for DELETE /api/pools/{name} when the pool is still
        referenced by one or more bindings. Carries the generic error/code
        plus the structured list of blocking bindings (id + destination); the
        CLI and Telegram surfaces render the same list (channel parity). This
        is a dedicated schema so the generic ErrorResponse envelope is not
        coupled to one endpoint.
      type: object
      required: [error, bindings]
      properties:
        error:
          type: string
        code:
          type: string
        bindings:
          type: array
          items:
            $ref: "#/components/schemas/PoolReferencingBinding"

    PoolReferencingBinding:
      type: object
      required: [id, destination]
      properties:
        id:
          type: integer
          format: int64
        destination:
          type: string

    HealthResponse:
      type: object
      required: [status]
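
A client handling the 409 from `DELETE /api/pools/{name}` could decode the `PoolReferencedErrorResponse` body roughly as sketched here. The Go type names and the sample body, including the `error` text and `code` value, are invented for illustration; only the field names and types follow the schema in this hunk.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// poolReferencingBinding mirrors the PoolReferencingBinding schema.
type poolReferencingBinding struct {
	ID          int64  `json:"id"`
	Destination string `json:"destination"`
}

// poolReferencedError mirrors the PoolReferencedErrorResponse schema.
type poolReferencedError struct {
	Error    string                   `json:"error"`
	Code     string                   `json:"code,omitempty"`
	Bindings []poolReferencingBinding `json:"bindings"`
}

// parsePoolReferenced decodes a 409 response body into the struct above.
func parsePoolReferenced(body []byte) (poolReferencedError, error) {
	var e poolReferencedError
	err := json.Unmarshal(body, &e)
	return e, err
}

func main() {
	// Sample body made up for this example; real messages and codes may differ.
	body := []byte(`{"error":"pool is still referenced","code":"pool_referenced",` +
		`"bindings":[{"id":7,"destination":"api.openai.com"}]}`)
	e, err := parsePoolReferenced(body)
	if err != nil {
		panic(err)
	}
	for _, b := range e.Bindings {
		fmt.Printf("blocked by binding %d (%s)\n", b.ID, b.Destination)
	}
}
```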
@@ -1009,6 +1172,98 @@ components:
        timeout_sec:
          type: integer

    Pool:
      type: object
      required: [name, strategy, members]
      properties:
        name:
          type: string
        strategy:
          type: string
          description: "Pool strategy (only 'failover' is supported)"
        created_at:
          type: string
          format: date-time
        members:
          type: array
          items:
            $ref: "#/components/schemas/PoolMember"

    PoolMember:
      type: object
      required: [credential, position]
      properties:
        credential:
          type: string
        position:
          type: integer

    CreatePoolRequest:
      type: object
      required: [name, members]
      properties:
        name:
          type: string
        strategy:
          type: string
          description: "Pool strategy; defaults to 'failover' when omitted"
        members:
          type: array
          description: "Ordered member credential names (failover order)"
          items:
            type: string

    PoolStatus:
      type: object
      required: [name, strategy, active, members]
      properties:
        name:
          type: string
        strategy:
          type: string
        active:
          type: string
          description: "Currently active member credential name"
        members:
          type: array
          items:
            $ref: "#/components/schemas/PoolMemberStatus"

    PoolMemberStatus:
      type: object
      required: [credential, position, active, state]
      properties:
        credential:
          type: string
        position:
          type: integer
        active:
          type: boolean
        state:
          type: string
          description: "healthy, cooldown, or healthy (cooldown expired)"
        cooldown_until:
          type: string
          format: date-time
        last_failure_reason:
          type: string

    PoolRotateResult:
      type: object
      required: [pool, from, to]
      properties:
        pool:
          type: string
        from:
          type: string
          description: "Member that was active and is now parked"
        to:
          type: string
          description: "New active member after the rotation"
        parked_until:
          type: string
          format: date-time

    AuditEntry:
      type: object
      required: [timestamp, verdict]
7 changes: 1 addition & 6 deletions cmd/sluice/main.go
@@ -516,12 +516,7 @@ func main() {
 	// Exhausted: no distinct member to fail over to (every
 	// member cooling) — report it as pool exhaustion, NOT a
 	// self-referential "X -> X" transition.
-	msg := fmt.Sprintf("pool %s failed over %s -> %s (%s)",
-		ev.Pool, ev.From, ev.To, ev.Reason)
-	if ev.Exhausted {
-		msg = fmt.Sprintf("pool %s exhausted: all members cooling down (%s); no healthy account to fail over to",
-			ev.Pool, ev.Reason)
-	}
+	msg := proxy.FormatFailoverNotice(ev)
 	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
 	defer cancel()
 	for _, ch := range failoverBroker.Channels() {
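
The body of `proxy.FormatFailoverNotice` is not part of this hunk. A formatter with the same responsibilities could be sketched as below, reusing the wording of the inline code removed here and dropping the parenthetical when the reason is empty, as the commit titles suggest. The `failoverEvent` type and the lowercase function name are stand-ins, and the merged helper's actual text may well differ (one commit reworks it to be friendlier).

```go
package main

import "fmt"

// failoverEvent is a hypothetical stand-in for the event passed to
// proxy.FormatFailoverNotice; its fields follow those used in the removed code.
type failoverEvent struct {
	Pool      string
	From      string
	To        string
	Reason    string
	Exhausted bool
}

// formatFailoverNotice builds the operator-facing message. An empty reason
// drops the parenthetical instead of rendering "()".
func formatFailoverNotice(ev failoverEvent) string {
	reason := ""
	if ev.Reason != "" {
		reason = fmt.Sprintf(" (%s)", ev.Reason)
	}
	if ev.Exhausted {
		// Every member is cooling: report exhaustion, not an "X -> X" hop.
		return fmt.Sprintf("pool %s exhausted: all members cooling down%s; no healthy account to fail over to",
			ev.Pool, reason)
	}
	return fmt.Sprintf("pool %s failed over %s -> %s%s", ev.Pool, ev.From, ev.To, reason)
}

func main() {
	fmt.Println(formatFailoverNotice(failoverEvent{Pool: "openai", From: "codex_a", To: "codex_b", Reason: "429"}))
	fmt.Println(formatFailoverNotice(failoverEvent{Pool: "openai", Exhausted: true}))
}
```

Keeping the formatting in one function means every channel sends the same text and the edge cases can be unit-tested without a running broker.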