feat: auth gate, rate limiting, and cost attribution #26
stackbilt-admin merged 3 commits into main
- Rate limiter: sliding window per-tenant using RATELIMIT_KV with tier-based limits (free=20/min, hobby=60, pro=300, enterprise=1000). Returns 429 with Retry-After and X-RateLimit-* headers.
- Cost attribution: per-tool credit costs with quality multipliers for image_generate. Reserves quota via edge-auth consumeQuota RPC before tool call, settles (commit/refund) after based on outcome. Free tools (read-only, zero cost) skip quota enforcement.
- Scope enforcement: mutation tools require 'generate' scope. tools/list filters catalog to match session scopes.
- AuthServiceRpc extended with checkQuota, consumeQuota, and commitOrRefundQuota methods matching edge-auth's entrypoint.
- All existing tests updated with new mocks; 25 new tests added.

Closes #18

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
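The reserve-then-settle flow described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in `src/cost-attribution.ts`: the `QuotaRpc` shape, the `catalog_lookup` tool, the `creditCost` and `callWithQuota` helpers, and every base cost and multiplier are assumptions chosen for the example (the numbers are picked only so that `ultra_plus` comes out at 10 × 4 = 40 credits).

```typescript
// Hypothetical cost table: tool names other than image_generate, base costs,
// and multipliers are illustrative placeholders, not the PR's real values.
type Quality = "standard" | "ultra_plus";

interface ToolCost {
  baseCost: number; // credits; 0 marks a free tool
  qualityMultipliers?: Partial<Record<Quality, number>>;
}

const TOOL_COSTS: Record<string, ToolCost> = {
  catalog_lookup: { baseCost: 0 },
  image_generate: { baseCost: 10, qualityMultipliers: { standard: 1, ultra_plus: 4 } },
};

// Assumed shape of the quota RPC; the real edge-auth signatures may differ.
interface QuotaRpc {
  consumeQuota(tenantId: string, credits: number): Promise<{ ok: boolean; reservationId: string }>;
  commitOrRefundQuota(reservationId: string, action: "commit" | "refund"): Promise<void>;
}

// Credit cost of one call: base cost times the quality multiplier (default 1).
function creditCost(tool: string, quality?: Quality): number {
  const cost = TOOL_COSTS[tool];
  if (!cost) throw new Error(`unknown tool: ${tool}`);
  const multiplier = quality ? cost.qualityMultipliers?.[quality] ?? 1 : 1;
  return cost.baseCost * multiplier;
}

// Reserve quota before the tool runs, then commit on success or refund on failure.
async function callWithQuota<T>(
  rpc: QuotaRpc,
  tenantId: string,
  tool: string,
  quality: Quality | undefined,
  run: () => Promise<T>,
): Promise<T> {
  const credits = creditCost(tool, quality);
  if (credits === 0) return run(); // free tools skip quota enforcement

  const reservation = await rpc.consumeQuota(tenantId, credits);
  if (!reservation.ok) throw new Error("quota exceeded");

  let result: T;
  try {
    result = await run();
  } catch (err) {
    // The tool failed, so the reserved credits go back to the tenant.
    await rpc.commitOrRefundQuota(reservation.reservationId, "refund");
    throw err;
  }
  await rpc.commitOrRefundQuota(reservation.reservationId, "commit");
  return result;
}
```

Reserving before the call and settling afterwards keeps a failed tool run from consuming credits, at the price of one extra RPC per paid call.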
Rebased onto current main (which absorbed C-1 scope enforcement, sprint-2 hardening, and legacy-grant-scope fallback since this PR was opened). Cleanup commit addresses review feedback plus one duplication that surfaced during the rebase.

Changes:
1. Remove duplicate scope check at proxyToolCall dispatch. Main's C-1 block enforces scope-to-risk mapping using the RISK_REQUIRED_SCOPES table and returns INVALID_REQUEST with outcome=insufficient_scope. The earlier version added here predated C-1 and returned INVALID_PARAMS with outcome=auth_denied. Keeping only the main/C-1 version, which is strictly more protective (catches READ_ONLY-with-empty-scopes that the older check skipped) and matches the existing gateway.test.ts expectations.
2. Fix "sliding window" comment in rate-limiter.ts: the implementation is fixed-window (`const windowStart = now - (now % windowSeconds)`).
3. Remove unreachable isReadOnly branch in reserveQuota's catch. Free tools return earlier at `if (cost.baseCost === 0)`; by the time we're in the catch, baseCost > 0 always and the branch was dead.
4. Add 'rate_limited' to AuditArtifact.outcome and use it for the rate-limit denial path (was reusing 'auth_denied', which conflated quota/throttling rejections with auth failures in downstream analytics).
5. Update test/gateway-legacy-scope.test.ts mocks to include the new AuthServiceRpc quota methods (checkQuota, consumeQuota, commitOrRefundQuota) and the RATELIMIT_KV binding. Test passes unchanged afterward.

Full suite: 176/176 passing. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
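To make the fixed-window behavior from item 2 concrete, here is a small sketch built around the same `windowStart` expression and the tier limits listed in the PR. It is not the code in `src/rate-limiter.ts`: the `checkFixedWindow` and `rateLimitHeaders` helpers, the `rl:` key format, and the `X-RateLimit-Limit` / `X-RateLimit-Reset` header names are assumptions (the PR only names `Retry-After`, `X-RateLimit-*`, and `X-RateLimit-Remaining`). The counter read and write against `RATELIMIT_KV` is left out; `priorCount` stands in for the value already stored for the current window.

```typescript
type Tier = "free" | "hobby" | "pro" | "enterprise";

// Per-minute limits as listed in the PR.
const LIMITS_PER_MINUTE: Record<Tier, number> = {
  free: 20,
  hobby: 60,
  pro: 300,
  enterprise: 1000,
};

interface RateDecision {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAt: number; // epoch seconds at which the current window ends
  retryAfter: number; // seconds to wait; 0 when the request is allowed
  key: string; // hypothetical KV key for this tenant and window
}

// Pure decision function: priorCount is the number of requests already
// counted in the current window before this one.
function checkFixedWindow(
  tenantId: string,
  tier: Tier,
  priorCount: number,
  nowSeconds: number,
  windowSeconds = 60,
): RateDecision {
  const windowStart = nowSeconds - (nowSeconds % windowSeconds);
  const resetAt = windowStart + windowSeconds;
  const limit = LIMITS_PER_MINUTE[tier];
  const allowed = priorCount < limit;
  const used = allowed ? priorCount + 1 : priorCount;
  return {
    allowed,
    limit,
    remaining: Math.max(0, limit - used),
    resetAt,
    retryAfter: allowed ? 0 : resetAt - nowSeconds,
    key: `rl:${tenantId}:${windowStart}`,
  };
}

// Header names other than Retry-After and X-RateLimit-Remaining are assumed.
function rateLimitHeaders(d: RateDecision): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(d.limit),
    "X-RateLimit-Remaining": String(d.remaining),
    "X-RateLimit-Reset": String(d.resetAt),
  };
  if (!d.allowed) headers["Retry-After"] = String(d.retryAfter);
  return headers;
}
```

Because every request in a window maps to the same key, a TTL slightly longer than the window is enough for cleanup. The trade-off against a true sliding window is that a tenant can burst up to twice its limit across a window boundary.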
Force-pushed from aef6170 to cbc7783.
Review: blocked on CI + four pre-merge cleanups
Feature scope is sound: per-tenant rate limiting, tool cost registry with quality multipliers for image generation, quota reserve/settle via edge-auth RPC, scope-based …
Concerns (in priority order)
Naming collision heads-up (not a conflict)
PR #27 adds a function named …
Test plan items still unchecked
This is billing-adjacent: silently wrong quota math costs real money. Exercise the integration tests before merge.
Required actions
Once CI is green and the cleanups land, ship.
Pre-merge integration test results
Ran the three integration tests from the PR's test plan against an uploaded preview version (…
Results
Test 1: Rate-limit 429 on free tier (20/min). PASS
Test 2: …
Test 3: …
Out-of-scope findings worth a follow-up issue (not blockers)
Recommendation
Ship as-is (CI green, code review cleanups landed). Open follow-ups for the three items above. Test 3's missing end-to-end verification is the only real gap; recommend either provisioning a Pro test tenant for one-time verification, or accepting code-level cost-table review as sufficient sign-off.
…forcement

Align README, user guide, API reference, and architecture docs with the behavior shipped in PR #18 and hardened in PR #26: corrects stale tier credits/multipliers, documents the fixed-window limiter and 429 semantics, and adds the scope/tier/risk-level enforcement matrix and quota attribution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Implements the three security-critical features from issue #18 (AgentRelay reference spec):
- Rate limiting: Fixed-window per-tenant rate limiter using `RATELIMIT_KV`. Tier-based limits: free=20/min, hobby=60, pro=300, enterprise=1000. Returns `429` with a standard `Retry-After` header once a tenant exceeds its limit, and sets `X-RateLimit-*` headers on all responses.
- Cost attribution: Per-tool credit costs with quality multipliers for `image_generate`. Reserves quota via the edge-auth `consumeQuota` RPC before each tool call, then commits or refunds based on outcome. Zero-cost tools (read-only lookups) skip quota enforcement entirely.
- Scope enforcement: Mutation tools (`LOCAL_MUTATION`, `EXTERNAL_MUTATION`, `DESTRUCTIVE`) require the `generate` scope. `tools/list` filters the catalog to show only tools the session has access to. API keys with read-only scopes cannot invoke mutations.

New files
- `src/rate-limiter.ts`: Fixed-window rate limiter with KV TTL for auto-cleanup
- `src/cost-attribution.ts`: Tool cost registry, quota reservation/settlement
- `test/rate-limiter.test.ts`: 8 tests
- `test/cost-attribution.test.ts`: 17 tests

Modified files
- `src/gateway.ts`: Wire rate limiting after auth, quota reserve before tool call, scope check on `tools/call`
- `src/types.ts`: Add `checkQuota`, `consumeQuota`, `commitOrRefundQuota` to `AuthServiceRpc`; add `RATELIMIT_KV` to `GatewayEnv`
- `wrangler.toml`: Add `RATELIMIT_KV` namespace binding

… `RATELIMIT_KV` …

Test plan
- … (`tsc --noEmit`)
- … `X-RateLimit-Remaining` header decrements
- … `image_generate` with `ultra_plus` quality, verify 40-credit cost

Closes #18
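The scope enforcement rule from the summary (mutation risk levels require the `generate` scope, and `tools/list` shows only what the session can call) might look something like the sketch below. The risk-level names and `RISK_REQUIRED_SCOPES` appear in this PR's discussion, but the table contents here are assumptions, as are the `read` scope required for `READ_ONLY`, the `ToolEntry` shape, and the `hasRequiredScopes` and `filterCatalog` helpers.

```typescript
type RiskLevel = "READ_ONLY" | "LOCAL_MUTATION" | "EXTERNAL_MUTATION" | "DESTRUCTIVE";

interface ToolEntry {
  name: string; // hypothetical catalog entry shape
  risk: RiskLevel;
}

// Assumed mapping: the PR states only that mutation levels need 'generate'.
// Requiring a 'read' scope for READ_ONLY is a guess made for this example.
const RISK_REQUIRED_SCOPES: Record<RiskLevel, readonly string[]> = {
  READ_ONLY: ["read"],
  LOCAL_MUTATION: ["generate"],
  EXTERNAL_MUTATION: ["generate"],
  DESTRUCTIVE: ["generate"],
};

// True when the session holds every scope the risk level requires.
function hasRequiredScopes(risk: RiskLevel, sessionScopes: readonly string[]): boolean {
  return RISK_REQUIRED_SCOPES[risk].every((scope) => sessionScopes.includes(scope));
}

// tools/list view: keep only the tools this session is allowed to call.
function filterCatalog(catalog: readonly ToolEntry[], sessionScopes: readonly string[]): ToolEntry[] {
  return catalog.filter((tool) => hasRequiredScopes(tool.risk, sessionScopes));
}
```

Using the same predicate for both `tools/list` filtering and the `tools/call` check keeps the two views consistent, so a session never sees a tool it would be denied on invocation.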
🤖 Generated with Claude Code