feat: add per-IP rate limiting with configurable env vars and 429 handler#121
feat: add per-IP rate limiting with configurable env vars and 429 handler#121harika880 wants to merge 3 commits into
Conversation
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughAdds an in-memory per-IP rate limiter, middleware and exception handling to surface X-RateLimit headers and 429 responses, wires limiter checks into health, session, and event endpoints (separate read/write limits), includes tests for headers, 429 behavior, client IP handling and pruning, and updates .gitignore. ChangesAPI Rate Limiting and Tests
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agentwatch/api/server.py (1)
432-451:⚠️ Potential issue | 🟠 MajorAdd rate limiting for session read endpoints
GET /api/v1/sessions/{session_id}andGET /api/v1/sessions/{session_id}/events(lines 432-451) don’t call_limiter.check(). If the requirement is 1000 req/min for read endpoints, these should be rate-limited (e.g., likeGET /api/v1/sessionsusesRATE_READ). Confirm whether this is intentional scope; otherwise add the limiter check.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agentwatch/api/server.py` around lines 432 - 451, Both GET handlers get_session and get_events are missing rate-limit enforcement; call the existing limiter before processing requests. Add a call to _limiter.check(RATE_READ) (or await if _limiter.check is async) at the start of get_session and get_events, matching how GET /api/v1/sessions uses RATE_READ, so the functions (_collector.get_trace, _collector.get_events) only run after the limiter check and keep existing behavior and return types unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agentwatch/api/server.py`:
- Around line 391-392: The code calls _limiter.check(request.client.host + ':r',
...) which will raise AttributeError when request.client is None; update the
endpoints health, list_sessions, create_session, and ingest_event to safely
derive a client identifier first (e.g., client_host = request.client.host if
request.client else request.headers.get('x-forwarded-for', 'unknown') or a fixed
fallback like 'unknown') and then call _limiter.check(client_host + ':r',
RATE_READ, request) (or appropriate RATE_*); locate the _limiter.check calls in
the functions health, list_sessions, create_session, and ingest_event and
replace the direct request.client.host access with this safe client_host
variable.
- Around line 51-75: The _Limiter._buckets map in the _Limiter class grows
unbounded because entries for unique IPs are never evicted; update _Limiter
(e.g., in the check method or via a lightweight background task started once) to
prune stale bucket entries whose b["start"] is older than the rate window (60s)
+ a small buffer (or a configurable TTL), and ensure pruning is safe for
concurrent access; reference _Limiter, _buckets and check() when adding the TTL
and cleanup logic and make the TTL/window configurable rather than hardcoding.
In `@tests/test_rate_limiting.py`:
- Line 49: The test is asserting the wrong 429 response body; update the
assertion that checks last.json() in tests/test_rate_limiting.py to expect
{"error": "rate_limit_exceeded"} instead of {"detail": "rate_limit_exceeded"} so
it matches Issue `#56`'s contract; ensure any other assertions or helper
expectations that reference "detail" for rate-limit responses are changed to use
"error" (search for last.json() or rate limit test helpers to update all
occurrences).
- Around line 5-7: get_client() currently imports the module-level app which
preserves environment-loaded rate limits and in-memory limiter state across
tests; change get_client() to build a fresh test app instance with explicit
small rate-limit settings (override API_RATE_LIMIT_WRITE/API_RATE_LIMIT_READ for
the test) and reset or create a new limiter/state for each invocation so each
test uses deterministic limits and no accumulated state—locate the import of app
in get_client() and replace it with the factory/constructor call that accepts
configuration or otherwise reinitialize the limiter before returning TestClient.
- Around line 13-15: The test currently allows a 401 to pass (assert
r.status_code in (200, 401)), which hides if the /api/v1/sessions endpoint is
still protected; update the assertion to require a successful public response
(assert r.status_code == 200) and keep the header checks for "X-RateLimit-Limit"
and "X-RateLimit-Remaining" on the response object r so the test verifies the
public contract for /api/v1/sessions.
- Around line 35-50: The test test_429_body_matches_spec currently skips
assertions when no 429 is observed; modify it to explicitly fail if the limiter
never returned 429 by adding an unconditional assertion that last is not None
and last.status_code == 429 (or raise an assertion like "Expected a 429 response
but none was returned") before inspecting last.json() and last.headers; locate
the test function test_429_body_matches_spec and adjust the control flow around
the variable last and the c.post loop to ensure the test fails when no 429
response is produced.
---
Outside diff comments:
In `@agentwatch/api/server.py`:
- Around line 432-451: Both GET handlers get_session and get_events are missing
rate-limit enforcement; call the existing limiter before processing requests.
Add a call to _limiter.check(RATE_READ) (or await if _limiter.check is async) at
the start of get_session and get_events, matching how GET /api/v1/sessions uses
RATE_READ, so the functions (_collector.get_trace, _collector.get_events) only
run after the limiter check and keep existing behavior and return types
unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: d306049c-778d-4114-8761-882897f075a0
📒 Files selected for processing (2)
agentwatch/api/server.pytests/test_rate_limiting.py
Co-authored-by: Cursor <cursoragent@cursor.com>
…erevanth#56) - Prune stale per-IP buckets with configurable TTL - Resolve client IP via X-Forwarded-For when request.client is missing - Return {error: rate_limit_exceeded} per issue spec - Strengthen tests so 429 and headers are required Co-authored-by: Cursor <cursoragent@cursor.com>
|
Analysis CompleteGenerated ECC bundle from 3 commits | Confidence: 55% View Pull Request #124Repository Profile
Changed Files (3)
Top hotspots
Top directories
Analysis Depth Readiness (evidence-backed, 50%)ECC Tools uses this to decide whether recommendations should stay at commit-history/setup guidance or expand into CI, security, harness, reference-set, AI-routing, and team backlog work.
Reference Set Readiness (1/7, 14%)
Likely Future Issues (2)
Suggested Follow-up Work (2)
Copy-ready bodies test: add integration coverage for agentwatch/api/server.py ## Summary
- Add integration or end-to-end coverage for the recently changed API surface.
## Why
- Backfill integration or end-to-end coverage for the changed API surface before more contract changes land.
## Touched paths
- `agentwatch/api/server.py`
## Validation
- Add or extend integration / e2e coverage for the changed API, route, or contract surface.
- Exercise the touched endpoints or route handlers against realistic request / response flows.docs: sync API contract for agentwatch/api/server.py ## Summary
- Update the API contract artifact that should reflect the recently changed implementation surface.
## Why
- Backfill the missing API contract or spec update before another implementation change lands on top of the same surface.
## Touched paths
- `agentwatch/api/server.py`
## Validation
- Update the relevant OpenAPI, GraphQL, or contract/spec artifact used by this repo.
- Run the contract validation, docs generation, or API verification flow that depends on that artifact.Review Activity (1 reviews, 6 inline comments, 0 unresolved threads)
Latest reviewer states
Review Follow-up Signals (2)
Recommended next actions
Detected Workflows (1)
Generated Instincts (19)
After merging, import with: Files
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agentwatch/api/server.py (1)
97-104:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Retry-Aftershould use the remaining window.Once a client is already partway through the bucket, always returning
RATE_WINDOW_SECtells them to wait longer than necessary. Derive the header fromnow - b["start"]so clients can back off accurately.⏳ Proposed fix
headers={ "X-RateLimit-Limit": str(limit), "X-RateLimit-Remaining": "0", - "Retry-After": str(RATE_WINDOW_SEC), + "Retry-After": str(max(1, RATE_WINDOW_SEC - int(now - b["start"]))), },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agentwatch/api/server.py` around lines 97 - 104, The Retry-After header currently always returns RATE_WINDOW_SEC; instead compute the remaining window using the bucket start timestamp: calculate now - b["start"] and set remaining = max(0, RATE_WINDOW_SEC - elapsed) then use that remaining (as an integer/str) for the "Retry-After" header; update the HTTPException headers in the rate limit branch that references b, limit and RATE_WINDOW_SEC to return this derived remaining value and ensure now is obtained (e.g., time.time()) before computing elapsed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agentwatch/api/server.py`:
- Around line 522-533: The system_status handler (and other handlers like
sessions/{session_id}/trace, /confidence, /reasoning, /cost, /replay,
/checkpoints, /rollback, /api/v1/safety/*, /api/v1/dashboard/summary,
/api/v1/governance/compliance-report, /api/v1/demo/seed) currently bypass the
rate limiter; either call _limiter.check(...) at the start of each of these
route functions (e.g., inside system_status) or refactor by creating a shared
FastAPI dependency or middleware that invokes _limiter.check (and keeps the
existing _require_api_key dependency) so every route automatically enforces the
limiter and new endpoints cannot omit it. Ensure you reference and invoke the
existing _limiter.check and _require_api_key symbols (or wire the limiter into
the global middleware stack) so all the listed handlers are protected.
- Around line 117-123: In _client_ip change the precedence so the socket peer is
used first: return request.client.host when request.client is present, and only
parse and use the x-forwarded-for header
(request.headers.get("x-forwarded-for")) as a fallback when request.client is
None or when explicit proxy-trust logic is enabled; ensure you still split on
commas and strip whitespace when using the header and return "unknown" if
neither source is available.
In `@tests/test_rate_limiting.py`:
- Around line 8-22: The fixture rate_client mutates server.RATE_WRITE and
server.RATE_READ and never restores them; change the fixture to save the
original values (e.g., orig_write = server.RATE_WRITE, orig_read =
server.RATE_READ) before overwriting, then use a yield-based teardown (yield the
TestClient) and after the yield restore server.RATE_WRITE = orig_write and
server.RATE_READ = orig_read and call server.reset_rate_limiter_for_tests()
again to revert global state so other tests aren’t affected; ensure the fixture
still sets env via monkeypatch and returns the TestClient during the yield.
---
Outside diff comments:
In `@agentwatch/api/server.py`:
- Around line 97-104: The Retry-After header currently always returns
RATE_WINDOW_SEC; instead compute the remaining window using the bucket start
timestamp: calculate now - b["start"] and set remaining = max(0, RATE_WINDOW_SEC
- elapsed) then use that remaining (as an integer/str) for the "Retry-After"
header; update the HTTPException headers in the rate limit branch that
references b, limit and RATE_WINDOW_SEC to return this derived remaining value
and ensure now is obtained (e.g., time.time()) before computing elapsed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 08b7ee65-4119-465d-ace4-4b3b55320f96
📒 Files selected for processing (3)
.gitignoreagentwatch/api/server.pytests/test_rate_limiting.py
✅ Files skipped from review due to trivial changes (1)
- .gitignore
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agentwatch/api/server.py (1)
97-104:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Retry-Aftershould use the remaining window.Once a client is already partway through the bucket, always returning
RATE_WINDOW_SECtells them to wait longer than necessary. Derive the header fromnow - b["start"]so clients can back off accurately.⏳ Proposed fix
headers={ "X-RateLimit-Limit": str(limit), "X-RateLimit-Remaining": "0", - "Retry-After": str(RATE_WINDOW_SEC), + "Retry-After": str(max(1, RATE_WINDOW_SEC - int(now - b["start"]))), },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agentwatch/api/server.py` around lines 97 - 104, The Retry-After header currently always returns RATE_WINDOW_SEC; instead compute the remaining window using the bucket start timestamp: calculate now - b["start"] and set remaining = max(0, RATE_WINDOW_SEC - elapsed) then use that remaining (as an integer/str) for the "Retry-After" header; update the HTTPException headers in the rate limit branch that references b, limit and RATE_WINDOW_SEC to return this derived remaining value and ensure now is obtained (e.g., time.time()) before computing elapsed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agentwatch/api/server.py`:
- Around line 522-533: The system_status handler (and other handlers like
sessions/{session_id}/trace, /confidence, /reasoning, /cost, /replay,
/checkpoints, /rollback, /api/v1/safety/*, /api/v1/dashboard/summary,
/api/v1/governance/compliance-report, /api/v1/demo/seed) currently bypass the
rate limiter; either call _limiter.check(...) at the start of each of these
route functions (e.g., inside system_status) or refactor by creating a shared
FastAPI dependency or middleware that invokes _limiter.check (and keeps the
existing _require_api_key dependency) so every route automatically enforces the
limiter and new endpoints cannot omit it. Ensure you reference and invoke the
existing _limiter.check and _require_api_key symbols (or wire the limiter into
the global middleware stack) so all the listed handlers are protected.
- Around line 117-123: In _client_ip change the precedence so the socket peer is
used first: return request.client.host when request.client is present, and only
parse and use the x-forwarded-for header
(request.headers.get("x-forwarded-for")) as a fallback when request.client is
None or when explicit proxy-trust logic is enabled; ensure you still split on
commas and strip whitespace when using the header and return "unknown" if
neither source is available.
In `@tests/test_rate_limiting.py`:
- Around line 8-22: The fixture rate_client mutates server.RATE_WRITE and
server.RATE_READ and never restores them; change the fixture to save the
original values (e.g., orig_write = server.RATE_WRITE, orig_read =
server.RATE_READ) before overwriting, then use a yield-based teardown (yield the
TestClient) and after the yield restore server.RATE_WRITE = orig_write and
server.RATE_READ = orig_read and call server.reset_rate_limiter_for_tests()
again to revert global state so other tests aren’t affected; ensure the fixture
still sets env via monkeypatch and returns the TestClient during the yield.
---
Outside diff comments:
In `@agentwatch/api/server.py`:
- Around line 97-104: The Retry-After header currently always returns
RATE_WINDOW_SEC; instead compute the remaining window using the bucket start
timestamp: calculate now - b["start"] and set remaining = max(0, RATE_WINDOW_SEC
- elapsed) then use that remaining (as an integer/str) for the "Retry-After"
header; update the HTTPException headers in the rate limit branch that
references b, limit and RATE_WINDOW_SEC to return this derived remaining value
and ensure now is obtained (e.g., time.time()) before computing elapsed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 08b7ee65-4119-465d-ace4-4b3b55320f96
📒 Files selected for processing (3)
.gitignoreagentwatch/api/server.pytests/test_rate_limiting.py
✅ Files skipped from review due to trivial changes (1)
- .gitignore
🛑 Comments failed to post (3)
agentwatch/api/server.py (2)
117-123:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't trust
X-Forwarded-Forahead of the socket peer.Preferring the header here lets a direct client pick a different IP on every request and sidestep the per-IP limiter. Use
request.client.hostfirst, and only fall back toX-Forwarded-Forwhenrequest.clientis unavailable or proxy trust is explicitly configured.🔐 Proposed fix
def _client_ip(request: Request) -> str: - forwarded = request.headers.get("x-forwarded-for") - if forwarded: - return forwarded.split(",")[0].strip() if request.client: return request.client.host + forwarded = request.headers.get("x-forwarded-for") + if forwarded: + candidate = forwarded.split(",")[0].strip() + if candidate: + return candidate return "unknown"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agentwatch/api/server.py` around lines 117 - 123, In _client_ip change the precedence so the socket peer is used first: return request.client.host when request.client is present, and only parse and use the x-forwarded-for header (request.headers.get("x-forwarded-for")) as a fallback when request.client is None or when explicit proxy-trust logic is enabled; ensure you still split on commas and strip whitespace when using the header and return "unknown" if neither source is available.
522-533:
⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftSeveral REST handlers still bypass the limiter.
This only limits
/health, the session list/detail/events routes, and/api/v1/events. The other REST handlers changed below — for example/api/v1/system/status,/api/v1/sessions/{session_id}/trace,/confidence,/reasoning,/cost,/replay,/checkpoints,/rollback,/api/v1/safety/*,/api/v1/dashboard/summary,/api/v1/governance/compliance-report, and/api/v1/demo/seed— never call_limiter.check(...), so the REST surface is still partially unprotected and some of the heaviest read paths remain floodable.Please either wire the remaining handlers directly or move the check into a shared dependency/middleware so new REST endpoints cannot miss it.
Also applies to: 624-813
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agentwatch/api/server.py` around lines 522 - 533, The system_status handler (and other handlers like sessions/{session_id}/trace, /confidence, /reasoning, /cost, /replay, /checkpoints, /rollback, /api/v1/safety/*, /api/v1/dashboard/summary, /api/v1/governance/compliance-report, /api/v1/demo/seed) currently bypass the rate limiter; either call _limiter.check(...) at the start of each of these route functions (e.g., inside system_status) or refactor by creating a shared FastAPI dependency or middleware that invokes _limiter.check (and keeps the existing _require_api_key dependency) so every route automatically enforces the limiter and new endpoints cannot omit it. Ensure you reference and invoke the existing _limiter.check and _require_api_key symbols (or wire the limiter into the global middleware stack) so all the listed handlers are protected.tests/test_rate_limiting.py (1)
8-22:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winRestore the mutated server limits during fixture teardown.
Line 18 and Line 19 overwrite
server.RATE_WRITE/server.RATE_READ, but the fixture never restores them. That leaks the test-only caps into later tests that importagentwatch.api.server, making the suite order-dependent.♻️ Suggested cleanup in the fixture
+from collections.abc import Iterator + `@pytest.fixture` -def rate_client(monkeypatch: pytest.MonkeyPatch) -> TestClient: +def rate_client(monkeypatch: pytest.MonkeyPatch) -> Iterator[TestClient]: """Fresh limiter with a low write cap so 429 is reachable quickly.""" monkeypatch.setenv("API_RATE_LIMIT_WRITE", "10") monkeypatch.setenv("API_RATE_LIMIT_READ", "1000") monkeypatch.delenv("AGENTWATCH_API_KEY", raising=False) monkeypatch.delenv("AGENTWATCH_ENV", raising=False) import agentwatch.api.server as server - server.RATE_WRITE = int(os.getenv("API_RATE_LIMIT_WRITE", "10")) - server.RATE_READ = int(os.getenv("API_RATE_LIMIT_READ", "1000")) - server.reset_rate_limiter_for_tests() - - return TestClient(server.app, raise_server_exceptions=False) + original_rate_write = server.RATE_WRITE + original_rate_read = server.RATE_READ + server.RATE_WRITE = int(os.getenv("API_RATE_LIMIT_WRITE", "10")) + server.RATE_READ = int(os.getenv("API_RATE_LIMIT_READ", "1000")) + server.reset_rate_limiter_for_tests() + + try: + with TestClient(server.app, raise_server_exceptions=False) as client: + yield client + finally: + server.RATE_WRITE = original_rate_write + server.RATE_READ = original_rate_read + server.reset_rate_limiter_for_tests()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_rate_limiting.py` around lines 8 - 22, The fixture rate_client mutates server.RATE_WRITE and server.RATE_READ and never restores them; change the fixture to save the original values (e.g., orig_write = server.RATE_WRITE, orig_read = server.RATE_READ) before overwriting, then use a yield-based teardown (yield the TestClient) and after the yield restore server.RATE_WRITE = orig_write and server.RATE_READ = orig_read and call server.reset_rate_limiter_for_tests() again to revert global state so other tests aren’t affected; ensure the fixture still sets env via monkeypatch and returns the TestClient during the yield.
Closes #56
Changes:
Summary by CodeRabbit
New Features
Tests
Chores