fix: codebase analysis findings 2026-04-11 #14
Conversation
- Fix critical WebSocket race condition: wrap connections with a per-connection write mutex to prevent concurrent `WriteMessage` calls (gorilla/websocket requires a single writer). Closes #13 partially.
- Fix data race on `prevCPUIdle`/`prevCPUTotal` in `SystemCollector.readCPU()` by protecting reads/writes with the existing mutex.
- Fix fail2ban timestamp parsing bug: Go's `time.Parse` uses '.' not ',' for fractional seconds, so all timestamps were silently failing to parse.
- Add `http.MaxBytesReader` on the login endpoint to prevent memory exhaustion from oversized request bodies (capped at 4KB).
- Add `Secure` and `SameSite` flags to the logout cookie (previously missing, inconsistent with the login cookie).
- Tighten DB directory permissions from 0755 to 0700.
- Set `db.SetMaxOpenConns(1)` for SQLite to prevent "database is locked" errors.
- Guard the network rate calculation against uint64 underflow on counter wrap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
> ⚠️ Caution: Review failed — the pull request was closed or merged during review.

📝 Walkthrough

Updated parsing, concurrency, and security-related behaviors across collectors, DB, HTTP server, and WebSocket hub: fail2ban timestamp parsing now handles comma milliseconds; the network/CPU collectors add monotonic checks and mutexing; DB directory permissions and the SQLite max-open-connections limit are tightened; the login body is capped; and WebSocket writes are serialized via a per-connection mutex wrapper.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant WSHandler
    participant Hub
    participant connWithMu
    participant websocketConn as websocket.Conn

    Client->>WSHandler: Upgrade request
    WSHandler->>Hub: Register(websocket.Conn)
    Hub-->>WSHandler: *connWithMu (wrapped)
    WSHandler->>connWithMu: start read loop / defer Unregister(*connWithMu)
    Note over Hub,connWithMu: Broadcast flow (concurrent callers)
    Client->>Hub: Broadcast(message)
    Hub->>connWithMu: acquire mu
    connWithMu->>websocketConn: WriteMessage(message)
    websocketConn-->>connWithMu: success / error
    connWithMu->>Hub: release mu
    alt write error
        Hub->>Hub: enqueue/unregister failed *connWithMu
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
Pull request overview
This PR addresses several automated codebase analysis findings across the backend, focusing on concurrency safety, security hardening, correctness fixes in collectors, and improved SQLite operational stability.
Changes:
- Added per-WebSocket-connection write serialization to avoid concurrent `WriteMessage` calls during broadcast.
- Hardened auth and session handling (login body size cap; logout cookie flags).
- Fixed collector correctness issues (fail2ban timestamp parsing; network counter reset guard; CPU previous-sample data race) and tuned SQLite connection pooling.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/server/ws.go | Wraps WebSocket conns with a per-connection mutex to avoid concurrent writes during broadcast. |
| internal/server/server.go | Caps login request body size and adds missing Secure/SameSite flags to logout cookie. |
| internal/db/db.go | Tightens DB directory permissions and limits SQLite max open conns to reduce lock contention. |
| internal/collectors/system.go | Guards against network counter reset underflow and locks CPU previous-sample state to avoid data races. |
| internal/collectors/fail2ban.go | Fixes Fail2ban log timestamp parsing by converting comma milliseconds to Go’s fractional seconds format. |
Comments suppressed due to low confidence (1)
internal/server/server.go:150
`http.MaxBytesReader` will cause `Decode` to fail with `*http.MaxBytesError` when the body exceeds 4KB, but the handler currently always returns 400. Consider detecting `MaxBytesError` and returning `StatusRequestEntityTooLarge` (413) with a clearer error, so clients can distinguish oversized payloads from malformed JSON.
```go
func (s *Server) handleLogin(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, 4096)
	var body struct {
		Email    string `json:"email"`
		Password string `json:"password"`
	}
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
		writeJSON(w, http.StatusBadRequest, map[string]string{"error": "invalid body"})
		return
	}
```
```diff
 func (h *Hub) Unregister(c *connWithMu) {
 	h.mu.Lock()
-	delete(h.clients, conn)
+	delete(h.clients, c)
 	h.mu.Unlock()
-	conn.Close()
+	c.conn.Close()
 }
```
`Unregister` always closes the underlying connection even if it was already removed. Since `Unregister` can now be invoked both from the read-loop defer and from `Broadcast()` on write failure, this can lead to a double close (and a close racing with an in-flight `WriteMessage`). Consider making `Unregister` idempotent (only close when the client was present) and coordinating `Close()` with the per-connection write mutex (e.g., lock `c.mu` while closing).
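A hedged sketch of the idempotency this comment asks for. The hub and field names follow the diff above, but the `closer` interface and the close-counting connection are illustrative stand-ins for `*websocket.Conn`:

```go
package main

import (
	"fmt"
	"sync"
)

// closer stands in for *websocket.Conn; only Close matters in this sketch.
type closer interface{ Close() error }

type connWithMu struct {
	mu   sync.Mutex
	conn closer
}

type Hub struct {
	mu      sync.RWMutex
	clients map[*connWithMu]bool
}

// Unregister removes and closes the client only if it was still present,
// so a second call (e.g. read-loop defer after a Broadcast failure) is a
// no-op. Holding c.mu while closing keeps Close from racing an in-flight
// WriteMessage that holds the same per-connection mutex.
func (h *Hub) Unregister(c *connWithMu) {
	h.mu.Lock()
	_, present := h.clients[c]
	delete(h.clients, c)
	h.mu.Unlock()
	if present {
		c.mu.Lock()
		c.conn.Close()
		c.mu.Unlock()
	}
}

// countingConn records how many times Close was called.
type countingConn struct{ closes int }

func (c *countingConn) Close() error { c.closes++; return nil }

func main() {
	cc := &countingConn{}
	c := &connWithMu{conn: cc}
	h := &Hub{clients: map[*connWithMu]bool{c: true}}

	h.Unregister(c) // closes once
	h.Unregister(c) // already removed: no second close
	fmt.Println("closes:", cc.closes)
}
```

The double call in `main` closes the connection exactly once, which is the behavior the review suggests.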
```go
// SQLite supports only one concurrent writer; limit pool to avoid "database is locked"
db.SetMaxOpenConns(1)

if err := db.Ping(); err != nil {
```
If `db.Ping()` fails, `Open()` returns without closing the `*sql.DB`, which can leak file descriptors/connections. Close the DB before returning the ping error (and optionally reset the pool settings if you add more).
```diff
 if err := db.Ping(); err != nil {
+	if closeErr := db.Close(); closeErr != nil {
+		return nil, fmt.Errorf("ping db: %w; close db: %v", err, closeErr)
+	}
```
> ⚠️ Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/server/server.go (1)
143-150: ⚠️ Potential issue | 🟡 Minor

Handle `*http.MaxBytesError` to return `413 Payload Too Large` instead of `400 Bad Request`.

The size cap is good, but oversized bodies are currently folded into a generic 400. When `http.MaxBytesReader` is exceeded, `json.NewDecoder(r.Body).Decode` returns a `*http.MaxBytesError`, which should be detected explicitly to return `http.StatusRequestEntityTooLarge` (413) for correct HTTP semantics.

Proposed fix

```diff
 import (
 	"encoding/json"
+	"errors"
 	"fmt"
 	...
 )

 func (s *Server) handleLogin(w http.ResponseWriter, r *http.Request) {
 	r.Body = http.MaxBytesReader(w, r.Body, 4096)
 	var body struct {
 		Email    string `json:"email"`
 		Password string `json:"password"`
 	}
 	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
+		var maxErr *http.MaxBytesError
+		if errors.As(err, &maxErr) {
+			writeJSON(w, http.StatusRequestEntityTooLarge, map[string]string{"error": "request body too large"})
+			return
+		}
 		writeJSON(w, http.StatusBadRequest, map[string]string{"error": "invalid body"})
 		return
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/server/server.go` around lines 143 - 150, When decoding the request body in the handler you must detect when json.NewDecoder(r.Body).Decode returns a *http.MaxBytesError and return http.StatusRequestEntityTooLarge (413) instead of the generic 400; update the block around r.Body = http.MaxBytesReader(w, r.Body, 4096) and the json.NewDecoder(r.Body).Decode(&body) call to type-assert the error to *http.MaxBytesError and call writeJSON(w, http.StatusRequestEntityTooLarge, ...) in that case, otherwise preserve the existing writeJSON(w, http.StatusBadRequest, ...) behavior for other decode errors.
🧹 Nitpick comments (3)
internal/collectors/fail2ban.go (2)
113-157: Check scanner errors after the loop.

The scanner could encounter I/O errors while reading the log file, but `scanner.Err()` is never checked after the loop completes. This could hide read failures and return incomplete results without indicating a problem.

🛡️ Proposed fix to check scanner errors

```diff
 	for i, j := 0, len(events)-1; i < j; i, j = i+1, j-1 {
 		events[i], events[j] = events[j], events[i]
 	}
+
+	if err := scanner.Err(); err != nil {
+		return nil, fmt.Errorf("reading fail2ban log: %w", err)
+	}

 	return events, nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/collectors/fail2ban.go` around lines 113 - 157, GetRecentBans in Fail2BanCollector reads the file with a bufio.Scanner but never checks scanner.Err() after the loop, so I/O errors are silently ignored; update GetRecentBans to call scanner.Err() once scanning completes and return that error (wrapped with context) if non-nil before returning events, ensuring the function propagates read errors from the scanner rather than silently returning possibly incomplete results.
121-149: Consider optimizing memory usage for large log files.The current implementation accumulates all matching events from the entire log file before trimming to the requested limit. For large log files with many ban events, this could use significant memory. A ring buffer or sliding window approach would keep only the last N events during scanning.
♻️ Example memory-efficient approach using a ring buffer
```diff
 var events []BanEvent
 scanner := bufio.NewScanner(file)
 for scanner.Scan() {
 	matches := banLogPattern.FindStringSubmatch(scanner.Text())
 	if matches == nil {
 		continue
 	}
 	// fail2ban uses comma for milliseconds (e.g. "2024-01-15 10:30:45,123")
 	// Go's time.Parse uses dot for fractional seconds, so replace comma
 	tsStr := strings.Replace(matches[1], ",", ".", 1)
 	ts, err := time.Parse("2006-01-02 15:04:05.000", tsStr)
 	if err != nil {
 		continue
 	}
 	events = append(events, BanEvent{
 		Timestamp: ts,
 		Jail:      matches[2],
 		Action:    strings.ToLower(matches[3]),
 		IP:        matches[4],
 	})
+
+	// Keep only the last 'limit' events to reduce memory usage
+	if len(events) > limit {
+		events = events[1:]
+	}
 }

-// Return last N events
-if len(events) > limit {
-	events = events[len(events)-limit:]
-}
-
 // Reverse to get newest first
 for i, j := 0, len(events)-1; i < j; i, j = i+1, j-1 {
 	events[i], events[j] = events[j], events[i]
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/collectors/fail2ban.go` around lines 121 - 149, The code currently appends every matching BanEvent into the events slice then trims to limit, causing high memory use; change the scanning loop to maintain only the last N events by using a fixed-size ring/window: create a slice buffer of capacity max(1, limit) and an index counter (e.g., writeIdx) and on each match write buffer[writeIdx%limit] = BanEvent{...} and increment writeIdx; after scanning, reconstruct the ordered result slice from the circular buffer (handling cases when totalMatches < limit) so the returned events preserve chronological order; keep existing references to BanEvent, banLogPattern, scanner, ts parsing and limit when implementing this replacement.

internal/server/ws.go (1)
66-82: Lock ordering is correct; consider async writes for improved latency.

The `RLock` is properly released before `Unregister` calls, avoiding deadlock. However, holding `RLock` while iterating and calling `WriteMessage` means slow clients block registration/unregistration of other clients. This is acceptable for modest client counts but could become a bottleneck at scale.

An alternative pattern (not required now) would be to snapshot the client list, release the lock, then write to each client concurrently or sequentially without holding the hub lock.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/server/ws.go` around lines 66 - 82, Broadcast currently holds h.mu.RLock while iterating h.clients and calling conn.WriteMessage, which can block hub registration/unregistration; change Broadcast (method on type Hub) to first acquire h.mu.RLock, build a snapshot slice of *connWithMu from h.clients, release the lock, then iterate that snapshot and perform c.mu.Lock()/WriteMessage()/Unlock() outside the hub lock; to improve latency optionally perform the writes concurrently (spawn goroutines per connection and collect failures via a channel) and then call h.Unregister for failed connections after releasing any hub locks.
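The snapshot pattern described in the nitpick above can be sketched like this. The `Hub`/`connWithMu` names mirror the comment; the write itself is stubbed with a counter rather than a real WebSocket call:

```go
package main

import (
	"fmt"
	"sync"
)

// connWithMu stubs a wrapped connection; writes just count.
type connWithMu struct {
	mu     sync.Mutex
	writes int
}

func (c *connWithMu) WriteMessage(msg []byte) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.writes++
	return nil
}

type Hub struct {
	mu      sync.RWMutex
	clients map[*connWithMu]bool
}

// Broadcast snapshots the client set under RLock, then performs the
// writes with the hub lock released, so a slow client cannot block
// Register/Unregister on the hub.
func (h *Hub) Broadcast(msg []byte) {
	h.mu.RLock()
	snapshot := make([]*connWithMu, 0, len(h.clients))
	for c := range h.clients {
		snapshot = append(snapshot, c)
	}
	h.mu.RUnlock()

	for _, c := range snapshot {
		c.WriteMessage(msg) // per-connection mutex still serializes writes
	}
}

func main() {
	a, b := &connWithMu{}, &connWithMu{}
	h := &Hub{clients: map[*connWithMu]bool{a: true, b: true}}
	h.Broadcast([]byte("hello"))
	fmt.Println(a.writes, b.writes)
}
```

Error handling (collecting failed connections and unregistering them after the loop) is omitted to keep the snapshot idea in focus.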
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d4cfddfa-8850-4a04-a0bc-7b899dc2fb01
📒 Files selected for processing (5)
- internal/collectors/fail2ban.go
- internal/collectors/system.go
- internal/db/db.go
- internal/server/server.go
- internal/server/ws.go
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 72001265f3
```diff
 if prev, ok := c.netHistory[iface]; ok {
 	dt := now.Sub(prev.timestamp).Seconds()
-	if dt > 0 {
+	if dt > 0 && rxBytes >= prev.rxBytes && txBytes >= prev.txBytes {
```
Compute RX/TX rates independently on counter reset

The new guard in `CollectNetwork` requires both counters to be non-decreasing before computing either rate, so if only one counter resets (e.g., an interface flap resets RX while TX continues), both `RxRate` and `TxRate` are forced to 0 for that sample. This drops valid throughput for the unaffected direction, a metrics accuracy regression; apply the reset check per counter so only the wrapped side is suppressed.
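One way to apply the check per direction, as the comment suggests. The helper and its numbers are illustrative, not the repo's actual field names:

```go
package main

import "fmt"

// rateOrZero computes a per-second rate for a single counter, returning 0
// only when that counter went backwards (reset/wrap) or the interval is
// invalid, instead of zeroing both directions at once.
func rateOrZero(cur, prev uint64, dt float64) float64 {
	if dt <= 0 || cur < prev {
		return 0
	}
	return float64(cur-prev) / dt
}

func main() {
	// RX counter reset after an interface flap; TX kept counting.
	rx := rateOrZero(100, 5000, 2.0)  // wrapped: suppressed to 0
	tx := rateOrZero(9000, 5000, 2.0) // healthy: 2000 B/s
	fmt.Println(rx, tx)
}
```

With the per-counter check, the sample reports the valid TX rate while only the wrapped RX side is suppressed.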
Closes #13

- Show login errors in the UI
- Parallelize Docker stats collection
- Add regression coverage for WebSocket, login limits, the CPU race, fail2ban parsing, and Docker stats

Co-authored-by: Codex <codex@openai.com>
Closes #13
Summary
Security

- Serialize WebSocket writes with a per-connection mutex to prevent concurrent `WriteMessage` calls
- Protect `prevCPUIdle`/`prevCPUTotal` with the existing mutex in `readCPU()`
- Cap the login request body with `http.MaxBytesReader` (4KB cap) to prevent memory exhaustion
- Add the missing `Secure` and `SameSite` attributes to the logout cookie

Bugs

Performance

- Set SQLite `MaxOpenConns(1)` to prevent "database is locked" errors

Test plan
Automatically generated by Codebase Analyst
Summary by CodeRabbit
Bug Fixes
Security
Improvements
Tests