feat(audit): add audit logging middleware, API, and dashboard #242

tinu-hareesswar wants to merge 1 commit into `main`
Conversation
Add a complete audit system that captures all API requests/responses:

- Postgres migration for `audit_log` table with indexes
- Axum middleware that logs endpoint, method, headers (sanitized), request/response bodies, latency, merchant_id, and request_id
- API endpoints: `GET /audit/stats`, `/audit/requests`, `/audit/request/:id`
- React dashboard with sortable endpoint stats, expandable request explorer with JSON syntax highlighting, time range filters, and pagination

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Adds an end-to-end audit logging feature: database storage, backend capture/query APIs, and a frontend dashboard for exploring audit data.
Changes:

- Introduces `audit_log` Postgres table migration with supporting indexes.
- Adds Axum audit middleware to capture request/response metadata and new `/audit/*` API endpoints to query stats and requests.
- Adds a React `AuditPage`, sidebar entry, route wiring, and Vite dev proxy for `/audit`.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.
Summary per file:
| File | Description |
|---|---|
| website/vite.config.ts | Adds Vite dev proxy rule for /audit to reach the backend locally. |
| website/src/components/pages/AuditPage.tsx | Implements the Audit dashboard UI (stats table + request explorer + pagination). |
| website/src/components/layout/Sidebar.tsx | Adds “Observability” section and Audit Log navigation link. |
| website/src/App.tsx | Registers the /audit route to render AuditPage. |
| src/routes/audit.rs | Adds backend audit query endpoints (/audit/stats, /audit/requests, /audit/request/:id). |
| src/routes.rs | Exposes the new routes::audit module. |
| src/middleware/mod.rs | Exposes the new middleware::audit module. |
| src/middleware/audit.rs | Adds request/response capture middleware and async insert into audit_log. |
| src/app.rs | Wires audit routes and installs the audit middleware into the service stack. |
| migrations_pg/2024_04_16_000000_create_audit_log/up.sql | Creates audit_log table + indexes. |
| migrations_pg/2024_04_16_000000_create_audit_log/down.sql | Drops audit_log table. |
```rust
    .bind::<Text, _>(ep.clone())
    .get_result(&*conn)
    .await
    .unwrap_or(CountRow { count: 0 });
```
The total count query uses unwrap_or(CountRow { count: 0 }), which silently hides database errors and can cause the API to return an incorrect total while still returning request rows. It’s better to propagate the error (500) consistently, or at least log it and return an error response.
Suggested change:

```diff
-    .unwrap_or(CountRow { count: 0 });
+    .map_err(|e| {
+        logger::error!("Failed to query audit request count: {:?}", e);
+        (
+            StatusCode::INTERNAL_SERVER_ERROR,
+            format!("Database query failed: {}", e),
+        )
+    })?;
```
```rust
    ))
    .get_result(&*conn)
    .await
    .unwrap_or(CountRow { count: 0 });
```
The total count query uses unwrap_or(CountRow { count: 0 }), which suppresses query failures and can mask real DB issues (returning total: 0 even when rows exist). Prefer handling this error the same way as the main query (log + return 500) to keep pagination metadata reliable.
Suggested change:

```diff
-    .unwrap_or(CountRow { count: 0 });
+    .map_err(|e| {
+        logger::error!("Failed to query audit request count: {:?}", e);
+        (
+            StatusCode::INTERNAL_SERVER_ERROR,
+            format!("Database query failed: {}", e),
+        )
+    })?;
```
```typescript
const avgLatency = stats.length > 0
  ? Math.round(stats.reduce((sum, s) => sum + s.avg_latency_ms, 0) / stats.length)
```
The “Avg Latency” summary is computed as a simple average of per-endpoint averages, which is mathematically incorrect when endpoints have different hit counts. If this is meant to represent overall average latency, compute a weighted average using each endpoint’s count (or compute it server-side from raw rows).
Suggested change:

```diff
-const avgLatency = stats.length > 0
-  ? Math.round(stats.reduce((sum, s) => sum + s.avg_latency_ms, 0) / stats.length)
+const avgLatency = totalRequests > 0
+  ? Math.round(stats.reduce((sum, s) => sum + (s.avg_latency_ms * s.count), 0) / totalRequests)
```
```typescript
        setRequestsTotal(data.total)
      }
    } catch {
      setRequests([])
```
On request fetch failure, the code clears requests but leaves requestsTotal unchanged. That can show stale totals/pagination from a previously expanded endpoint. Consider also resetting requestsTotal (and possibly requestsPage) in the catch/failure path.
Suggested change:

```diff
       setRequests([])
+      setRequestsTotal(0)
```
```rust
        Err(e) => {
            logger::error!("Failed to read request body for audit: {:?}", e);
            let request = Request::from_parts(parts, Body::empty());
            return next.run(request).await;
        }
```
If reading the request body fails (e.g., body exceeds the 1MB limit), the middleware reconstructs the request with an empty body and forwards it. That can break handlers that require the original payload. Consider avoiding consuming the body when it’s too large (e.g., check Content-Length / size hints first and skip body capture), or otherwise ensure the original body is still forwarded unchanged when audit capture fails.
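One way to avoid consuming an oversized body at all is to check the `Content-Length` hint before buffering. A minimal, dependency-free sketch of that check, assuming the same 1 MiB limit as the middleware's `to_bytes` call (the plain `HashMap` header map and the `should_capture_body` helper are illustrative, not the PR's actual types):

```rust
use std::collections::HashMap;

// Matches the 1 MiB read limit the middleware passes to `to_bytes`.
const AUDIT_BODY_LIMIT: usize = 1024 * 1024;

/// Decide whether the audit middleware should attempt to buffer a body,
/// based on the Content-Length header if present. Bodies with no length
/// hint (e.g. chunked transfer) are allowed through and still rely on
/// the read limit as a backstop.
fn should_capture_body(headers: &HashMap<String, String>) -> bool {
    match headers
        .get("content-length")
        .and_then(|v| v.parse::<usize>().ok())
    {
        Some(len) => len <= AUDIT_BODY_LIMIT,
        None => true,
    }
}
```

With this check, an oversized request skips body capture entirely and is forwarded untouched, rather than being reconstructed with an empty body after a failed read.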
```rust
    // Extract response body
    let (resp_parts, resp_body) = response.into_parts();
    let resp_bytes = match to_bytes(resp_body, 1024 * 1024).await {
        Ok(bytes) => bytes,
        Err(e) => {
            logger::error!("Failed to read response body for audit: {:?}", e);
            let response = Response::from_parts(resp_parts, Body::empty());
            return response;
```
If reading the response body fails (commonly when it exceeds the 1MB limit), the middleware returns the response with an empty body. This changes API behavior for callers. Prefer returning the original response body unchanged and skipping audit body capture on failure (e.g., based on Content-Length / size hints or by gracefully falling back without altering the response).
Suggested change:

```diff
-    // Extract response body
-    let (resp_parts, resp_body) = response.into_parts();
-    let resp_bytes = match to_bytes(resp_body, 1024 * 1024).await {
-        Ok(bytes) => bytes,
-        Err(e) => {
-            logger::error!("Failed to read response body for audit: {:?}", e);
-            let response = Response::from_parts(resp_parts, Body::empty());
-            return response;
+    // Extract response body only when size hints indicate it is safe to do so.
+    let response_content_length = response
+        .headers()
+        .get(axum::http::header::CONTENT_LENGTH)
+        .and_then(|value| value.to_str().ok())
+        .and_then(|value| value.parse::<usize>().ok());
+    if matches!(response_content_length, Some(len) if len > 1024 * 1024) {
+        logger::error!(
+            "Skipping response body audit because Content-Length exceeds read limit: {:?}",
+            response_content_length
+        );
+        return response;
+    }
+    let (resp_parts, resp_body) = response.into_parts();
+    let resp_bytes = match to_bytes(resp_body, 1024 * 1024).await {
+        Ok(bytes) => bytes,
+        Err(e) => {
+            logger::error!("Failed to read response body for audit: {:?}", e);
+            return Response::from_parts(resp_parts, Body::empty());
```
```rust
    tokio::spawn(async move {
        if let Err(e) = insert_audit_log(
            &endpoint,
            &method,
            &req_headers_json,
```
This spawns a detached task per request to write the audit row. Under load, unbounded tokio::spawn can create large task backlogs and amplify DB pool pressure. Consider a bounded channel/worker (or batching) for audit inserts, with backpressure/drop policy, so audit logging can’t degrade request handling.
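A std-only sketch of the bounded-queue idea follows; the real service would likely use `tokio::sync::mpsc` with a dedicated worker task draining into the DB pool, and `AuditEvent`/`enqueue_audit` are hypothetical names, not part of the PR:

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Hypothetical audit payload; the real event would carry endpoint,
/// method, headers, bodies, latency, merchant_id, and request_id.
#[derive(Debug)]
pub struct AuditEvent {
    pub endpoint: String,
}

/// Non-blocking enqueue with a drop-on-full policy: returns false (and
/// discards the event) when the bounded queue is full, so audit logging
/// can never stall request handling or pile up unbounded tasks.
pub fn enqueue_audit(tx: &SyncSender<AuditEvent>, event: AuditEvent) -> bool {
    match tx.try_send(event) {
        Ok(()) => true,
        // A real implementation would increment a dropped-events metric here.
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
    }
}
```

A single worker (or small pool) receiving from the channel can then batch inserts, which also reduces pressure on the DB connection pool compared with one detached task per request.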
```rust
        .map_err(|e| {
            logger::error!("Failed to query audit request: {:?}", e);
            (
                StatusCode::NOT_FOUND,
                "Audit log entry not found".to_string(),
            )
```
`audit_request_by_id` maps any Diesel error to a 404 Not Found. That will incorrectly return 404 on transient DB errors/timeouts and makes debugging harder. Consider returning 404 only for `diesel::result::Error::NotFound`, and otherwise return 500 (while still logging the underlying error).
Suggested change:

```diff
-    .map_err(|e| {
-        logger::error!("Failed to query audit request: {:?}", e);
-        (
-            StatusCode::NOT_FOUND,
-            "Audit log entry not found".to_string(),
-        )
+    .map_err(|e| match e {
+        diesel::result::Error::NotFound => {
+            logger::error!("Audit log entry not found for id {}: {:?}", id, e);
+            (
+                StatusCode::NOT_FOUND,
+                "Audit log entry not found".to_string(),
+            )
+        }
+        _ => {
+            logger::error!("Failed to query audit request: {:?}", e);
+            (
+                StatusCode::INTERNAL_SERVER_ERROR,
+                "Failed to query audit log entry".to_string(),
+            )
+        }
```
Summary

- `audit_log` table migration with indexes on timestamp, endpoint, request_id, and merchant_id
- Middleware writes to `audit_log` asynchronously via `tokio::spawn`
- `GET /audit/stats` (per-endpoint hit counts with avg latency and error rates), `GET /audit/requests` (paginated request list), `GET /audit/request/:id` (single entry detail)
- `AuditPage` dashboard with sortable endpoint stats table, time range filter (1h/6h/24h/7d), search, expandable request explorer with full JSON request/response inspection, and pagination

Test plan

- Run the migration (`up.sql`) against a test database
- Verify `/audit/stats`, `/audit/requests`, and `/audit/request/:id` return valid JSON
- Hit existing endpoints (e.g. `/decide-gateway`) and verify they appear in the audit log
- Check that sensitive headers are sanitized in `request_headers`
- Load the `/audit` page and verify the dashboard renders

🤖 Generated with Claude Code