Skip to content

feat: add per-IP rate limiting with configurable env vars and 429 handler#121

Open
harika880 wants to merge 3 commits into
sreerevanth:mainfrom
harika880:feature/rate-limit-final
Open

feat: add per-IP rate limiting with configurable env vars and 429 handler#121
harika880 wants to merge 3 commits into
sreerevanth:mainfrom
harika880:feature/rate-limit-final

Conversation

@harika880
Copy link
Copy Markdown

@harika880 harika880 commented Jun 2, 2026

Closes #56

Changes:

  • Per-IP rate limiting added to REST API endpoints
  • Read endpoints limited to 1000 req/min (API_RATE_LIMIT_READ)
  • Write endpoints limited to 200 req/min (API_RATE_LIMIT_WRITE)
  • X-RateLimit-Limit and X-RateLimit-Remaining headers on all responses
  • 429 with {"detail": "rate_limit_exceeded"} + Retry-After header
  • All 3 tests passing ✅
  • node_modules not tracked ✅

Summary by CodeRabbit

  • New Features

    • Added IP-based rate limiting for API endpoints with standard rate-limit headers and 429 responses when limits are exceeded.
  • Tests

    • Added/updated tests to validate rate-limit headers, 429 responses, client IP handling, and limiter pruning.
  • Chores

    • Expanded ignore rules to more broadly exclude Node dependency directories.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 08b7ee65-4119-465d-ace4-4b3b55320f96

📥 Commits

Reviewing files that changed from the base of the PR and between 8092e01 and 8b53fdb.

📒 Files selected for processing (3)
  • .gitignore
  • agentwatch/api/server.py
  • tests/test_rate_limiting.py
✅ Files skipped from review due to trivial changes (1)
  • .gitignore

📝 Walkthrough

Walkthrough

Adds an in-memory per-IP rate limiter, middleware and exception handling to surface X-RateLimit headers and 429 responses, wires limiter checks into health, session, and event endpoints (separate read/write limits), includes tests for headers, 429 behavior, client IP handling and pruning, and updates .gitignore.

Changes

API Rate Limiting and Tests

Layer / File(s) Summary
Limiter imports and core implementation
agentwatch/api/server.py
Adds time/defaultdict imports, updates FastAPI imports to include Request, and implements rate-limit config constants and the _Limiter class with per-IP buckets, key extraction, and a reset_rate_limiter_for_tests() helper.
HTTPException handler and rl_headers middleware
agentwatch/api/server.py
Adds a global HTTPException handler to ensure 429 responses include rate-limit headers and a rl_headers middleware that attaches X-RateLimit-Limit and X-RateLimit-Remaining to responses when limiter metadata exists on request.state.
Endpoint wiring to limiter checks
agentwatch/api/server.py
Updates /health, session list/create/get, session events (GET), and events ingest (POST) to accept Request and call _limiter.check(...) using client-IP-based keys with separate read/write limits; startup unchanged.
.gitignore Node ignores
.gitignore
Add broader node_modules/ ignores including recursive **/node_modules/ and include agentwatch-landing/node_modules/ in frontend ignores.
Rate limiting tests and fixtures
tests/test_rate_limiting.py
Introduce rate_client fixture that sets env vars and resets limiter; tests verify X-RateLimit headers on /health, POST exhaustion returns 429 and exact JSON {"error": "rate_limit_exceeded"} with Retry-After, verify x-forwarded-for selection, and test pruning of stale limiter buckets.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰
I counted hops inside the log,
Then paused the stream to mind the bog.
When floods came in, I stamped my paw—
"A 429!"—and fixed the law.
Now headers sing, the API breathes.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 3.85% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding per-IP rate limiting with configurable environment variables and a 429 handler.
Linked Issues check ✅ Passed The implementation meets all acceptance criteria: per-IP rate limiting with configurable env vars, default limits, X-RateLimit headers, 429 response with correct JSON body, and test coverage.
Out of Scope Changes check ✅ Passed The .gitignore changes for Node dependencies are tangentially related but not core to the rate limiting implementation; however, they appear to be necessary housekeeping for the repository.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
agentwatch/api/server.py (1)

432-451: ⚠️ Potential issue | 🟠 Major

Add rate limiting for session read endpoints
GET /api/v1/sessions/{session_id} and GET /api/v1/sessions/{session_id}/events (lines 432-451) don’t call _limiter.check(). If the requirement is 1000 req/min for read endpoints, these should be rate-limited (e.g., like GET /api/v1/sessions uses RATE_READ). Confirm whether this is intentional scope; otherwise add the limiter check.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agentwatch/api/server.py` around lines 432 - 451, Both GET handlers
get_session and get_events are missing rate-limit enforcement; call the existing
limiter before processing requests. Add a call to _limiter.check(RATE_READ) (or
await if _limiter.check is async) at the start of get_session and get_events,
matching how GET /api/v1/sessions uses RATE_READ, so the functions
(_collector.get_trace, _collector.get_events) only run after the limiter check
and keep existing behavior and return types unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agentwatch/api/server.py`:
- Around line 391-392: The code calls _limiter.check(request.client.host + ':r',
...) which will raise AttributeError when request.client is None; update the
endpoints health, list_sessions, create_session, and ingest_event to safely
derive a client identifier first (e.g., client_host = request.client.host if
request.client else request.headers.get('x-forwarded-for', 'unknown') or a fixed
fallback like 'unknown') and then call _limiter.check(client_host + ':r',
RATE_READ, request) (or appropriate RATE_*); locate the _limiter.check calls in
the functions health, list_sessions, create_session, and ingest_event and
replace the direct request.client.host access with this safe client_host
variable.
- Around line 51-75: The _Limiter._buckets map in the _Limiter class grows
unbounded because entries for unique IPs are never evicted; update _Limiter
(e.g., in the check method or via a lightweight background task started once) to
prune stale bucket entries whose b["start"] is older than the rate window (60s)
+ a small buffer (or a configurable TTL), and ensure pruning is safe for
concurrent access; reference _Limiter, _buckets and check() when adding the TTL
and cleanup logic and make the TTL/window configurable rather than hardcoding.

In `@tests/test_rate_limiting.py`:
- Line 49: The test is asserting the wrong 429 response body; update the
assertion that checks last.json() in tests/test_rate_limiting.py to expect
{"error": "rate_limit_exceeded"} instead of {"detail": "rate_limit_exceeded"} so
it matches Issue `#56`'s contract; ensure any other assertions or helper
expectations that reference "detail" for rate-limit responses are changed to use
"error" (search for last.json() or rate limit test helpers to update all
occurrences).
- Around line 5-7: get_client() currently imports the module-level app which
preserves environment-loaded rate limits and in-memory limiter state across
tests; change get_client() to build a fresh test app instance with explicit
small rate-limit settings (override API_RATE_LIMIT_WRITE/API_RATE_LIMIT_READ for
the test) and reset or create a new limiter/state for each invocation so each
test uses deterministic limits and no accumulated state—locate the import of app
in get_client() and replace it with the factory/constructor call that accepts
configuration or otherwise reinitialize the limiter before returning TestClient.
- Around line 13-15: The test currently allows a 401 to pass (assert
r.status_code in (200, 401)), which hides if the /api/v1/sessions endpoint is
still protected; update the assertion to require a successful public response
(assert r.status_code == 200) and keep the header checks for "X-RateLimit-Limit"
and "X-RateLimit-Remaining" on the response object r so the test verifies the
public contract for /api/v1/sessions.
- Around line 35-50: The test test_429_body_matches_spec currently skips
assertions when no 429 is observed; modify it to explicitly fail if the limiter
never returned 429 by adding an unconditional assertion that last is not None
and last.status_code == 429 (or raise an assertion like "Expected a 429 response
but none was returned") before inspecting last.json() and last.headers; locate
the test function test_429_body_matches_spec and adjust the control flow around
the variable last and the c.post loop to ensure the test fails when no 429
response is produced.

---

Outside diff comments:
In `@agentwatch/api/server.py`:
- Around line 432-451: Both GET handlers get_session and get_events are missing
rate-limit enforcement; call the existing limiter before processing requests.
Add a call to _limiter.check(RATE_READ) (or await if _limiter.check is async) at
the start of get_session and get_events, matching how GET /api/v1/sessions uses
RATE_READ, so the functions (_collector.get_trace, _collector.get_events) only
run after the limiter check and keep existing behavior and return types
unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d306049c-778d-4114-8761-882897f075a0

📥 Commits

Reviewing files that changed from the base of the PR and between 46cd2fa and 8092e01.

📒 Files selected for processing (2)
  • agentwatch/api/server.py
  • tests/test_rate_limiting.py

Comment thread agentwatch/api/server.py
Comment thread agentwatch/api/server.py Outdated
Comment thread tests/test_rate_limiting.py Outdated
Comment thread tests/test_rate_limiting.py Outdated
Comment thread tests/test_rate_limiting.py Outdated
Comment thread tests/test_rate_limiting.py Outdated
harika880 and others added 2 commits June 3, 2026 13:05
…erevanth#56)

- Prune stale per-IP buckets with configurable TTL
- Resolve client IP via X-Forwarded-For when request.client is missing
- Return {error: rate_limit_exceeded} per issue spec
- Strengthen tests so 429 and headers are required

Co-authored-by: Cursor <cursoragent@cursor.com>
@ecc-tools
Copy link
Copy Markdown

ecc-tools Bot commented Jun 3, 2026

Analyzing 200 commits...

@ecc-tools
Copy link
Copy Markdown

ecc-tools Bot commented Jun 3, 2026

Analysis Complete

Generated ECC bundle from 3 commits | Confidence: 55%

View Pull Request #124

Repository Profile
Attribute Value
Language Python
Framework Not detected
Commit Convention mixed
Test Directory separate
Changed Files (3)
Metric Value
Files changed 3
Additions 219
Deletions 5

Top hotspots

Path Status +/-
agentwatch/api/server.py modified +124 / -5
tests/test_rate_limiting.py added +90 / -0
.gitignore modified +5 / -0

Top directories

Directory Files Total changes
agentwatch/api 1 129
tests 1 90
. 1 5
Analysis Depth Readiness (evidence-backed, 50%)

ECC Tools uses this to decide whether recommendations should stay at commit-history/setup guidance or expand into CI, security, harness, reference-set, AI-routing, and team backlog work.

Area Status Evidence / Next Step
Commit history Partial 3 commits sampled
CI/CD signals Ready .github/workflows/test-on-pr.yml
Security evidence Missing Add AgentShield, audit, SARIF, SBOM, or security review evidence so recommendations can cover security posture.
Harness configuration Ready tests/test_adapter_crewai.py
Reference/eval evidence Missing Add fixtures, golden traces, reference sets, or evaluator benchmarks so deeper recommendations have regression evidence.
AI routing and cost controls Ready tests/test_rate_limiting.py, agentwatch/cost/tracker.py
Team handoff and project tracking Missing Add roadmap, runbook, project, Linear, or follow-up tracking docs so generated work can land in a team queue.
Reference Set Readiness (1/7, 14%)
Area Status Evidence / Next Step
Deep analyzer corpus Missing Add analyzer fixture, golden, benchmark, or reference-set files that can catch analyzer regressions.
RAG/evaluator comparison Missing Add retrieval or evaluator reference-set comparison fixtures with expected ranking behavior.
PR salvage/review corpus Missing Add stale-PR, review-thread, reopen-flow, or salvage reference cases for queue cleanup automation.
Discussion triage corpus Missing Add public discussion triage fixtures, golden cases, or reference sets for informational, answered, and no-response classifications.
Harness compatibility Present tests/test_adapter_crewai.py
Security evidence Missing Attach security evidence such as SBOMs, SARIF, audit reports, or AgentShield evidence packs.
CI failure-mode evidence Missing Add captured CI failure logs, dry-run fixtures, or troubleshooting docs for common workflow failure modes.
Likely Future Issues (2)
Severity Signal Why it may show up
HIGH API contract changes may ship without integration coverage 1 API surface paths changed; 0 integration or e2e tests changed
MEDIUM API implementation changes may ship without contract artifact updates 1 API implementation paths changed; 0 API contract/spec files changed
  • API contract changes may ship without integration coverage: The PR changes API or route-facing files but does not touch any obvious integration or end-to-end tests.
  • API implementation changes may ship without contract artifact updates: The PR changes API implementation files but does not touch any obvious OpenAPI, GraphQL, or contract/spec artifact.
Suggested Follow-up Work (2)
Type Suggested title Targets
PR test: add integration coverage for agentwatch/api/server.py agentwatch/api/server.py
PR docs: sync API contract for agentwatch/api/server.py agentwatch/api/server.py
  • test: add integration coverage for agentwatch/api/server.py: Backfill integration or end-to-end coverage for the changed API surface before more contract changes land.
  • docs: sync API contract for agentwatch/api/server.py: Backfill the missing API contract or spec update before another implementation change lands on top of the same surface.

Copy-ready bodies

test: add integration coverage for agentwatch/api/server.py

## Summary
- Add integration or end-to-end coverage for the recently changed API surface.

## Why
- Backfill integration or end-to-end coverage for the changed API surface before more contract changes land.

## Touched paths
- `agentwatch/api/server.py`

## Validation
- Add or extend integration / e2e coverage for the changed API, route, or contract surface.
- Exercise the touched endpoints or route handlers against realistic request / response flows.

docs: sync API contract for agentwatch/api/server.py

## Summary
- Update the API contract artifact that should reflect the recently changed implementation surface.

## Why
- Backfill the missing API contract or spec update before another implementation change lands on top of the same surface.

## Touched paths
- `agentwatch/api/server.py`

## Validation
- Update the relevant OpenAPI, GraphQL, or contract/spec artifact used by this repo.
- Run the contract validation, docs generation, or API verification flow that depends on that artifact.
Review Activity (1 reviews, 6 inline comments, 0 unresolved threads)
Signal Count
Approvals 0
Change requests 0
Comment-only reviews 1
Dismissed reviews 0
Pending reviews 0
Review threads 6
Unresolved threads 0
Outdated threads 5
Latest review Commented
Latest submitted at 2026-06-02T06:32:29Z

Latest reviewer states

Reviewer State Submitted
@coderabbitai[bot] Commented 2026-06-02T06:32:29Z
Review Follow-up Signals (2)
Severity Signal Evidence
MEDIUM Re-check outdated review discussions 5 outdated review thread(s) may refer to older code
MEDIUM Get an explicit approval No approving review is recorded for this PR

Recommended next actions

  • Confirm the latest pushed code resolves stale review feedback before relying on old comments.
  • Ask for an approval after requested changes and unresolved discussions are addressed.
Detected Workflows (1)
Workflow Description
api-endpoint-rate-limiting-update Implements or fixes rate limiting logic for an API endpoint, including updating server logic and corresponding tests.
Generated Instincts (19)
Domain Count
git 3
code-style 9
testing 5
workflow 2

After merging, import with:

/instinct-import .claude/homunculus/instincts/inherited/AgentWatch-instincts.yaml

Files

  • .claude/ecc-tools.json
  • .claude/skills/AgentWatch/SKILL.md
  • .agents/skills/AgentWatch/SKILL.md
  • .agents/skills/AgentWatch/agents/openai.yaml
  • .claude/identity.json
  • .codex/config.toml
  • .codex/AGENTS.md
  • .codex/agents/explorer.toml
  • .codex/agents/reviewer.toml
  • .codex/agents/docs-researcher.toml
  • .claude/homunculus/instincts/inherited/AgentWatch-instincts.yaml
  • .claude/commands/api-endpoint-rate-limiting-update.md

ECC Tools | Everything Claude Code

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
agentwatch/api/server.py (1)

97-104: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Retry-After should use the remaining window.

Once a client is already partway through the bucket, always returning RATE_WINDOW_SEC tells them to wait longer than necessary. Derive the header from now - b["start"] so clients can back off accurately.

⏳ Proposed fix
                 headers={
                     "X-RateLimit-Limit": str(limit),
                     "X-RateLimit-Remaining": "0",
-                    "Retry-After": str(RATE_WINDOW_SEC),
+                    "Retry-After": str(max(1, RATE_WINDOW_SEC - int(now - b["start"]))),
                 },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agentwatch/api/server.py` around lines 97 - 104, The Retry-After header
currently always returns RATE_WINDOW_SEC; instead compute the remaining window
using the bucket start timestamp: calculate now - b["start"] and set remaining =
max(0, RATE_WINDOW_SEC - elapsed) then use that remaining (as an integer/str)
for the "Retry-After" header; update the HTTPException headers in the rate limit
branch that references b, limit and RATE_WINDOW_SEC to return this derived
remaining value and ensure now is obtained (e.g., time.time()) before computing
elapsed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agentwatch/api/server.py`:
- Around line 522-533: The system_status handler (and other handlers like
sessions/{session_id}/trace, /confidence, /reasoning, /cost, /replay,
/checkpoints, /rollback, /api/v1/safety/*, /api/v1/dashboard/summary,
/api/v1/governance/compliance-report, /api/v1/demo/seed) currently bypass the
rate limiter; either call _limiter.check(...) at the start of each of these
route functions (e.g., inside system_status) or refactor by creating a shared
FastAPI dependency or middleware that invokes _limiter.check (and keeps the
existing _require_api_key dependency) so every route automatically enforces the
limiter and new endpoints cannot omit it. Ensure you reference and invoke the
existing _limiter.check and _require_api_key symbols (or wire the limiter into
the global middleware stack) so all the listed handlers are protected.
- Around line 117-123: In _client_ip change the precedence so the socket peer is
used first: return request.client.host when request.client is present, and only
parse and use the x-forwarded-for header
(request.headers.get("x-forwarded-for")) as a fallback when request.client is
None or when explicit proxy-trust logic is enabled; ensure you still split on
commas and strip whitespace when using the header and return "unknown" if
neither source is available.

In `@tests/test_rate_limiting.py`:
- Around line 8-22: The fixture rate_client mutates server.RATE_WRITE and
server.RATE_READ and never restores them; change the fixture to save the
original values (e.g., orig_write = server.RATE_WRITE, orig_read =
server.RATE_READ) before overwriting, then use a yield-based teardown (yield the
TestClient) and after the yield restore server.RATE_WRITE = orig_write and
server.RATE_READ = orig_read and call server.reset_rate_limiter_for_tests()
again to revert global state so other tests aren’t affected; ensure the fixture
still sets env via monkeypatch and returns the TestClient during the yield.

---

Outside diff comments:
In `@agentwatch/api/server.py`:
- Around line 97-104: The Retry-After header currently always returns
RATE_WINDOW_SEC; instead compute the remaining window using the bucket start
timestamp: calculate now - b["start"] and set remaining = max(0, RATE_WINDOW_SEC
- elapsed) then use that remaining (as an integer/str) for the "Retry-After"
header; update the HTTPException headers in the rate limit branch that
references b, limit and RATE_WINDOW_SEC to return this derived remaining value
and ensure now is obtained (e.g., time.time()) before computing elapsed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 08b7ee65-4119-465d-ace4-4b3b55320f96

📥 Commits

Reviewing files that changed from the base of the PR and between 8092e01 and 8b53fdb.

📒 Files selected for processing (3)
  • .gitignore
  • agentwatch/api/server.py
  • tests/test_rate_limiting.py
✅ Files skipped from review due to trivial changes (1)
  • .gitignore

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
agentwatch/api/server.py (1)

97-104: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Retry-After should use the remaining window.

Once a client is already partway through the bucket, always returning RATE_WINDOW_SEC tells them to wait longer than necessary. Derive the header from now - b["start"] so clients can back off accurately.

⏳ Proposed fix
                 headers={
                     "X-RateLimit-Limit": str(limit),
                     "X-RateLimit-Remaining": "0",
-                    "Retry-After": str(RATE_WINDOW_SEC),
+                    "Retry-After": str(max(1, RATE_WINDOW_SEC - int(now - b["start"]))),
                 },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agentwatch/api/server.py` around lines 97 - 104, The Retry-After header
currently always returns RATE_WINDOW_SEC; instead compute the remaining window
using the bucket start timestamp: calculate now - b["start"] and set remaining =
max(0, RATE_WINDOW_SEC - elapsed) then use that remaining (as an integer/str)
for the "Retry-After" header; update the HTTPException headers in the rate limit
branch that references b, limit and RATE_WINDOW_SEC to return this derived
remaining value and ensure now is obtained (e.g., time.time()) before computing
elapsed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agentwatch/api/server.py`:
- Around line 522-533: The system_status handler (and other handlers like
sessions/{session_id}/trace, /confidence, /reasoning, /cost, /replay,
/checkpoints, /rollback, /api/v1/safety/*, /api/v1/dashboard/summary,
/api/v1/governance/compliance-report, /api/v1/demo/seed) currently bypass the
rate limiter; either call _limiter.check(...) at the start of each of these
route functions (e.g., inside system_status) or refactor by creating a shared
FastAPI dependency or middleware that invokes _limiter.check (and keeps the
existing _require_api_key dependency) so every route automatically enforces the
limiter and new endpoints cannot omit it. Ensure you reference and invoke the
existing _limiter.check and _require_api_key symbols (or wire the limiter into
the global middleware stack) so all the listed handlers are protected.
- Around line 117-123: In _client_ip change the precedence so the socket peer is
used first: return request.client.host when request.client is present, and only
parse and use the x-forwarded-for header
(request.headers.get("x-forwarded-for")) as a fallback when request.client is
None or when explicit proxy-trust logic is enabled; ensure you still split on
commas and strip whitespace when using the header and return "unknown" if
neither source is available.

In `@tests/test_rate_limiting.py`:
- Around line 8-22: The fixture rate_client mutates server.RATE_WRITE and
server.RATE_READ and never restores them; change the fixture to save the
original values (e.g., orig_write = server.RATE_WRITE, orig_read =
server.RATE_READ) before overwriting, then use a yield-based teardown (yield the
TestClient) and after the yield restore server.RATE_WRITE = orig_write and
server.RATE_READ = orig_read and call server.reset_rate_limiter_for_tests()
again to revert global state so other tests aren’t affected; ensure the fixture
still sets env via monkeypatch and returns the TestClient during the yield.

---

Outside diff comments:
In `@agentwatch/api/server.py`:
- Around line 97-104: The Retry-After header currently always returns
RATE_WINDOW_SEC; instead compute the remaining window using the bucket start
timestamp: calculate now - b["start"] and set remaining = max(0, RATE_WINDOW_SEC
- elapsed) then use that remaining (as an integer/str) for the "Retry-After"
header; update the HTTPException headers in the rate limit branch that
references b, limit and RATE_WINDOW_SEC to return this derived remaining value
and ensure now is obtained (e.g., time.time()) before computing elapsed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 08b7ee65-4119-465d-ace4-4b3b55320f96

📥 Commits

Reviewing files that changed from the base of the PR and between 8092e01 and 8b53fdb.

📒 Files selected for processing (3)
  • .gitignore
  • agentwatch/api/server.py
  • tests/test_rate_limiting.py
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🛑 Comments failed to post (3)
agentwatch/api/server.py (2)

117-123: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't trust X-Forwarded-For ahead of the socket peer.

Preferring the header here lets a direct client pick a different IP on every request and sidestep the per-IP limiter. Use request.client.host first, and only fall back to X-Forwarded-For when request.client is unavailable or proxy trust is explicitly configured.

🔐 Proposed fix
 def _client_ip(request: Request) -> str:
-    forwarded = request.headers.get("x-forwarded-for")
-    if forwarded:
-        return forwarded.split(",")[0].strip()
     if request.client:
         return request.client.host
+    forwarded = request.headers.get("x-forwarded-for")
+    if forwarded:
+        candidate = forwarded.split(",")[0].strip()
+        if candidate:
+            return candidate
     return "unknown"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agentwatch/api/server.py` around lines 117 - 123, In _client_ip change the
precedence so the socket peer is used first: return request.client.host when
request.client is present, and only parse and use the x-forwarded-for header
(request.headers.get("x-forwarded-for")) as a fallback when request.client is
None or when explicit proxy-trust logic is enabled; ensure you still split on
commas and strip whitespace when using the header and return "unknown" if
neither source is available.

522-533: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Several REST handlers still bypass the limiter.

This only limits /health, the session list/detail/events routes, and /api/v1/events. The other REST handlers changed below — for example /api/v1/system/status, /api/v1/sessions/{session_id}/trace, /confidence, /reasoning, /cost, /replay, /checkpoints, /rollback, /api/v1/safety/*, /api/v1/dashboard/summary, /api/v1/governance/compliance-report, and /api/v1/demo/seed — never call _limiter.check(...), so the REST surface is still partially unprotected and some of the heaviest read paths remain floodable.

Please either wire the remaining handlers directly or move the check into a shared dependency/middleware so new REST endpoints cannot miss it.

Also applies to: 624-813

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agentwatch/api/server.py` around lines 522 - 533, The system_status handler
(and other handlers like sessions/{session_id}/trace, /confidence, /reasoning,
/cost, /replay, /checkpoints, /rollback, /api/v1/safety/*,
/api/v1/dashboard/summary, /api/v1/governance/compliance-report,
/api/v1/demo/seed) currently bypass the rate limiter; either call
_limiter.check(...) at the start of each of these route functions (e.g., inside
system_status) or refactor by creating a shared FastAPI dependency or middleware
that invokes _limiter.check (and keeps the existing _require_api_key dependency)
so every route automatically enforces the limiter and new endpoints cannot omit
it. Ensure you reference and invoke the existing _limiter.check and
_require_api_key symbols (or wire the limiter into the global middleware stack)
so all the listed handlers are protected.
tests/test_rate_limiting.py (1)

8-22: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restore the mutated server limits during fixture teardown.

Line 18 and Line 19 overwrite server.RATE_WRITE / server.RATE_READ, but the fixture never restores them. That leaks the test-only caps into later tests that import agentwatch.api.server, making the suite order-dependent.

♻️ Suggested cleanup in the fixture
+from collections.abc import Iterator
+
 `@pytest.fixture`
-def rate_client(monkeypatch: pytest.MonkeyPatch) -> TestClient:
+def rate_client(monkeypatch: pytest.MonkeyPatch) -> Iterator[TestClient]:
     """Fresh limiter with a low write cap so 429 is reachable quickly."""
     monkeypatch.setenv("API_RATE_LIMIT_WRITE", "10")
     monkeypatch.setenv("API_RATE_LIMIT_READ", "1000")
     monkeypatch.delenv("AGENTWATCH_API_KEY", raising=False)
     monkeypatch.delenv("AGENTWATCH_ENV", raising=False)

     import agentwatch.api.server as server

-    server.RATE_WRITE = int(os.getenv("API_RATE_LIMIT_WRITE", "10"))
-    server.RATE_READ = int(os.getenv("API_RATE_LIMIT_READ", "1000"))
-    server.reset_rate_limiter_for_tests()
-
-    return TestClient(server.app, raise_server_exceptions=False)
+    original_rate_write = server.RATE_WRITE
+    original_rate_read = server.RATE_READ
+    server.RATE_WRITE = int(os.getenv("API_RATE_LIMIT_WRITE", "10"))
+    server.RATE_READ = int(os.getenv("API_RATE_LIMIT_READ", "1000"))
+    server.reset_rate_limiter_for_tests()
+
+    try:
+        with TestClient(server.app, raise_server_exceptions=False) as client:
+            yield client
+    finally:
+        server.RATE_WRITE = original_rate_write
+        server.RATE_READ = original_rate_read
+        server.reset_rate_limiter_for_tests()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_rate_limiting.py` around lines 8 - 22, The fixture rate_client
mutates server.RATE_WRITE and server.RATE_READ and never restores them; change
the fixture to save the original values (e.g., orig_write = server.RATE_WRITE,
orig_read = server.RATE_READ) before overwriting, then use a yield-based
teardown (yield the TestClient) and after the yield restore server.RATE_WRITE =
orig_write and server.RATE_READ = orig_read and call
server.reset_rate_limiter_for_tests() again to revert global state so other
tests aren’t affected; ensure the fixture still sets env via monkeypatch and
returns the TestClient during the yield.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add rate limiting to REST API endpoints

1 participant