
fix: codebase analysis findings 2026-04-11#14

Merged
LiukScot merged 2 commits into main from codebase-analysis/fixes-2026-04-11
Apr 20, 2026

Conversation

Owner

@LiukScot LiukScot commented Apr 11, 2026

Closes #13

Summary

Security

  • WebSocket race condition (critical): Added per-connection write mutex to prevent concurrent WriteMessage calls
  • CPU data race (critical): Protected prevCPUIdle/prevCPUTotal with existing mutex in readCPU()
  • Login body size limit: Added http.MaxBytesReader (4KB cap) to prevent memory exhaustion
  • Logout cookie flags: Added missing Secure and SameSite attributes
  • DB directory permissions: Tightened from 0755 to 0700

Bugs

  • Fail2ban timestamp parsing: Fixed Go time format — comma replaced with dot for fractional seconds (all events were silently skipped)
  • Network rate underflow: Added guard against uint64 wrap on counter reset

Performance

  • SQLite connection pool: Set MaxOpenConns(1) to prevent "database is locked" errors

Test plan

  • Verify WebSocket metrics still broadcast correctly with multiple clients
  • Confirm fail2ban recent bans endpoint returns events (was broken before)
  • Check login still works and rejects oversized bodies
  • Verify logout properly clears cookie

Automatically generated by Codebase Analyst

Summary by CodeRabbit

  • Bug Fixes

    • Fixed fail2ban timestamp parsing with comma milliseconds
    • Prevented invalid network rate calculations when counters regress
    • Ensured thread-safe WebSocket writes and CPU metric reads
  • Security

    • Tightened database directory permissions
    • Added request size limits to login endpoint
    • Strengthened logout cookie attributes
  • Improvements

    • Constrained DB connections for SQLite
    • Gathered container stats concurrently for faster collection
  • Tests

    • Added concurrency and parsing tests for collectors, server, and Docker stats

- Fix critical WebSocket race condition: wrap connections with per-conn
  write mutex to prevent concurrent WriteMessage calls (gorilla/websocket
  requires single writer). Closes #13 partially.
- Fix data race on prevCPUIdle/prevCPUTotal in SystemCollector.readCPU()
  by protecting reads/writes with existing mutex.
- Fix fail2ban timestamp parsing bug: Go time.Parse uses '.' not ','
  for fractional seconds. All timestamps were silently failing to parse.
- Add http.MaxBytesReader on login endpoint to prevent memory exhaustion
  from oversized request bodies (capped at 4KB).
- Add Secure and SameSite flags to logout cookie (was missing, inconsistent
  with login cookie).
- Tighten DB directory permissions from 0755 to 0700.
- Set db.SetMaxOpenConns(1) for SQLite to prevent "database is locked" errors.
- Guard network rate calculation against uint64 underflow on counter wrap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 11, 2026 00:06

coderabbitai bot commented Apr 11, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Updated parsing, concurrency, and security-related behaviors across collectors, DB, HTTP server, and WebSocket hub: fail2ban timestamp parsing handles comma milliseconds; network/cpu collectors add monotonic checks and mutexing; DB directory perms and SQLite max open connections tightened; login body capped; WebSocket writes serialized via per-connection mutex wrapper.

Changes

  • Fail2ban & System collectors (internal/collectors/fail2ban.go, internal/collectors/system.go): Fail2ban timestamp parsing now converts the first comma to a dot before parsing milliseconds; network RX/TX rate computation requires monotonic counters; CPU read/update now uses an explicit mutex around prevCPU fields.
  • Docker collector & tests (internal/collectors/docker.go, internal/collectors/docker_test.go): Gather container stats concurrently with a semaphore (maxConcurrentStatsRequests = 4); new test asserting parallel collection and timing expectations.
  • Fail2ban tests (internal/collectors/fail2ban_test.go): New test verifying parsing of timestamps with comma-separated milliseconds.
  • System collector tests (internal/collectors/system_test.go): New concurrency test exercising readCPU under concurrent invocations.
  • Database (internal/db/db.go): DB directory creation permission changed from 0755 to 0700; SQLite pool limited with db.SetMaxOpenConns(1) before Ping.
  • HTTP server & tests (internal/server/server.go, internal/server/server_test.go): handleLogin wraps r.Body with http.MaxBytesReader(..., 4096); handleLogout sets Secure and SameSite=Lax on the deletion cookie; tests added for oversized body rejection and Hub broadcast concurrency.
  • WebSocket hub & handler (internal/server/ws.go): Introduced connWithMu (conn + mutex); Hub.clients is now map[*connWithMu]bool; Register returns *connWithMu; Unregister accepts *connWithMu; Broadcast locks per-connection during WriteMessage and unregisters failed connections.
  • Frontend login UX (frontend/src/routes/+layout.svelte): Added reactive loginError state, cleared on submit, displayed on failure; UI shows the error message under the password field.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant WSHandler
    participant Hub
    participant connWithMu
    participant websocketConn as websocket.Conn

    Client->>WSHandler: Upgrade request
    WSHandler->>Hub: Register(websocket.Conn)
    Hub-->>WSHandler: *connWithMu (wrapped)
    WSHandler->>connWithMu: start read loop / defer Unregister(*connWithMu)

    Note over Hub,connWithMu: Broadcast flow (concurrent callers)
    Client->>Hub: Broadcast(message)
    Hub->>connWithMu: acquire mu
    connWithMu->>websocketConn: WriteMessage(message)
    websocketConn-->>connWithMu: success / error
    connWithMu->>Hub: release mu
    alt write error
        Hub->>Hub: enqueue/unregister failed *connWithMu
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Poem

🐰
I nibbled logs and found a comma bright,
I swapped it for a dot and parsed the night,
I hopped through mutexes to keep the threads in line,
I capped the bytes so requests behave fine,
I wrapped each socket — now we all sleep tight.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 0.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Title check (❓ Inconclusive): the title "fix: codebase analysis findings 2026-04-11" is vague and generic, using a date reference without conveying the nature of the changes to someone scanning history. Resolution: use a more descriptive title, such as "fix: address security and concurrency issues" or "fix: WebSocket race condition, timestamp parsing, and request limits".
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Linked Issues check (✅ Passed): all major coding requirements from issue #13 are implemented: WebSocket write mutex, CPU race protection, fail2ban timestamp parsing, network counter underflow guard, login request size limit, logout cookie attributes, DB permissions, SQLite pool config, Docker concurrent stats, and comprehensive tests.
  • Out of Scope Changes check (✅ Passed): all changes directly address objectives from issue #13. Frontend login error display and Docker concurrent collection are explicitly mentioned in the PR summary commit message as follow-ups to the issue.



Copilot AI left a comment


Pull request overview

This PR addresses several automated codebase analysis findings across the backend, focusing on concurrency safety, security hardening, correctness fixes in collectors, and improved SQLite operational stability.

Changes:

  • Added per-WebSocket-connection write serialization to avoid concurrent WriteMessage calls during broadcast.
  • Hardened auth and session handling (login body size cap; logout cookie flags).
  • Fixed collector correctness issues (Fail2ban timestamp parsing; network counter reset guard; CPU previous-sample data race) and tuned SQLite connection pooling.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
internal/server/ws.go Wraps WebSocket conns with a per-connection mutex to avoid concurrent writes during broadcast.
internal/server/server.go Caps login request body size and adds missing Secure/SameSite flags to logout cookie.
internal/db/db.go Tightens DB directory permissions and limits SQLite max open conns to reduce lock contention.
internal/collectors/system.go Guards against network counter reset underflow and locks CPU previous-sample state to avoid data races.
internal/collectors/fail2ban.go Fixes Fail2ban log timestamp parsing by converting comma milliseconds to Go’s fractional seconds format.
Comments suppressed due to low confidence (1)

internal/server/server.go:150

  • http.MaxBytesReader will cause Decode to fail with *http.MaxBytesError when the body exceeds 4KB, but the handler currently always returns 400. Consider detecting MaxBytesError and returning StatusRequestEntityTooLarge (413) (with a clearer error) so clients can distinguish oversized payloads from malformed JSON.
func (s *Server) handleLogin(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, 4096)
	var body struct {
		Email    string `json:"email"`
		Password string `json:"password"`
	}
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
		writeJSON(w, http.StatusBadRequest, map[string]string{"error": "invalid body"})
		return

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/server/ws.go
Comment on lines +59 to +63
 func (h *Hub) Unregister(c *connWithMu) {
 	h.mu.Lock()
-	delete(h.clients, conn)
+	delete(h.clients, c)
 	h.mu.Unlock()
-	conn.Close()
+	c.conn.Close()

Copilot AI Apr 11, 2026


Unregister always closes the underlying connection even if it was already removed. Since Unregister can now be invoked both from the read-loop defer and from Broadcast() on write failure, this can lead to double-close (and close racing with an in-flight WriteMessage). Consider making Unregister idempotent (only close when the client was present) and coordinating Close() with the per-connection write mutex (e.g., lock c.mu while closing).

Comment thread internal/db/db.go
// SQLite supports only one concurrent writer; limit pool to avoid "database is locked"
db.SetMaxOpenConns(1)

if err := db.Ping(); err != nil {

Copilot AI Apr 11, 2026


If db.Ping() fails, Open() returns without closing the *sql.DB, which can leak file descriptors / connections. Close the DB before returning the ping error (and optionally reset the pool settings if you add more).

Suggested change
-if err := db.Ping(); err != nil {
+if err := db.Ping(); err != nil {
+	if closeErr := db.Close(); closeErr != nil {
+		return nil, fmt.Errorf("ping db: %w; close db: %v", err, closeErr)
+	}


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/server/server.go (1)

143-150: ⚠️ Potential issue | 🟡 Minor

Handle *http.MaxBytesError to return 413 Payload Too Large instead of 400 Bad Request.

The size cap is good, but oversized bodies are currently folded into a generic 400. When http.MaxBytesReader is exceeded, json.NewDecoder.Decode returns a *http.MaxBytesError which should be detected explicitly to return http.StatusRequestEntityTooLarge (413) for correct HTTP semantics.

Proposed fix
import (
	"encoding/json"
+	"errors"
	"fmt"
	...
)

func (s *Server) handleLogin(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, 4096)
	var body struct {
		Email    string `json:"email"`
		Password string `json:"password"`
	}
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
+		var maxErr *http.MaxBytesError
+		if errors.As(err, &maxErr) {
+			writeJSON(w, http.StatusRequestEntityTooLarge, map[string]string{"error": "request body too large"})
+			return
+		}
		writeJSON(w, http.StatusBadRequest, map[string]string{"error": "invalid body"})
		return
	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/server/server.go` around lines 143 - 150, When decoding the request
body in the handler you must detect when json.NewDecoder(r.Body).Decode returns
a *http.MaxBytesError and return http.StatusRequestEntityTooLarge (413) instead
of the generic 400; update the block around r.Body = http.MaxBytesReader(w,
r.Body, 4096) and the json.NewDecoder(r.Body).Decode(&body) call to type-assert
the error to *http.MaxBytesError and call writeJSON(w,
http.StatusRequestEntityTooLarge, ...) in that case, otherwise preserve the
existing writeJSON(w, http.StatusBadRequest, ...) behavior for other decode
errors.
🧹 Nitpick comments (3)
internal/collectors/fail2ban.go (2)

113-157: Check scanner errors after the loop.

The scanner could encounter I/O errors while reading the log file, but scanner.Err() is never checked after the loop completes. This could hide read failures and return incomplete results without indicating a problem.

🛡️ Proposed fix to check scanner errors
 	for i, j := 0, len(events)-1; i < j; i, j = i+1, j-1 {
 		events[i], events[j] = events[j], events[i]
 	}
+
+	if err := scanner.Err(); err != nil {
+		return nil, fmt.Errorf("reading fail2ban log: %w", err)
+	}
 
 	return events, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/collectors/fail2ban.go` around lines 113 - 157, GetRecentBans in
Fail2BanCollector reads the file with a bufio.Scanner but never checks
scanner.Err() after the loop, so I/O errors are silently ignored; update
GetRecentBans to call scanner.Err() once scanning completes and return that
error (wrapped with context) if non-nil before returning events, ensuring the
function propagates read errors from the scanner rather than silently returning
possibly incomplete results.

121-149: Consider optimizing memory usage for large log files.

The current implementation accumulates all matching events from the entire log file before trimming to the requested limit. For large log files with many ban events, this could use significant memory. A ring buffer or sliding window approach would keep only the last N events during scanning.

♻️ Example memory-efficient approach using a ring buffer
 	var events []BanEvent
 	scanner := bufio.NewScanner(file)
 
 	for scanner.Scan() {
 		matches := banLogPattern.FindStringSubmatch(scanner.Text())
 		if matches == nil {
 			continue
 		}
 
 		// fail2ban uses comma for milliseconds (e.g. "2024-01-15 10:30:45,123")
 		// Go's time.Parse uses dot for fractional seconds, so replace comma
 		tsStr := strings.Replace(matches[1], ",", ".", 1)
 		ts, err := time.Parse("2006-01-02 15:04:05.000", tsStr)
 		if err != nil {
 			continue
 		}
 
 		events = append(events, BanEvent{
 			Timestamp: ts,
 			Jail:      matches[2],
 			Action:    strings.ToLower(matches[3]),
 			IP:        matches[4],
 		})
+
+		// Keep only the last 'limit' events to reduce memory usage
+		if len(events) > limit {
+			events = events[1:]
+		}
 	}
 
-	// Return last N events
-	if len(events) > limit {
-		events = events[len(events)-limit:]
-	}
-
 	// Reverse to get newest first
 	for i, j := 0, len(events)-1; i < j; i, j = i+1, j-1 {
 		events[i], events[j] = events[j], events[i]
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/collectors/fail2ban.go` around lines 121 - 149, The code currently
appends every matching BanEvent into the events slice then trims to limit,
causing high memory use; change the scanning loop to maintain only the last N
events by using a fixed-size ring/window: create a slice buffer of capacity
max(1, limit) and an index counter (e.g., writeIdx) and on each match write
buffer[writeIdx%limit] = BanEvent{...} and increment writeIdx; after scanning,
reconstruct the ordered result slice from the circular buffer (handling cases
when totalMatches < limit) so the returned events preserve chronological order;
keep existing references to BanEvent, banLogPattern, scanner, ts parsing and
limit when implementing this replacement.
internal/server/ws.go (1)

66-82: Lock ordering is correct; consider async writes for improved latency.

The RLock is properly released before Unregister calls, avoiding deadlock. However, holding RLock while iterating and calling WriteMessage means slow clients block registration/unregistration of other clients. This is acceptable for modest client counts but could become a bottleneck at scale.

An alternative pattern (not required now) would be to snapshot the client list, release the lock, then write to each client concurrently or sequentially without holding the hub lock.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/server/ws.go` around lines 66 - 82, Broadcast currently holds
h.mu.RLock while iterating h.clients and calling conn.WriteMessage, which can
block hub registration/unregistration; change Broadcast (method on type Hub) to
first acquire h.mu.RLock, build a snapshot slice of *connWithMu from h.clients,
release the lock, then iterate that snapshot and perform
c.mu.Lock()/WriteMessage()/Unlock() outside the hub lock; to improve latency
optionally perform the writes concurrently (spawn goroutines per connection and
collect failures via a channel) and then call h.Unregister for failed
connections after releasing any hub locks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d4cfddfa-8850-4a04-a0bc-7b899dc2fb01

📥 Commits

Reviewing files that changed from the base of the PR and between acf3a3f and 7200126.

📒 Files selected for processing (5)
  • internal/collectors/fail2ban.go
  • internal/collectors/system.go
  • internal/db/db.go
  • internal/server/server.go
  • internal/server/ws.go


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 72001265f3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

 if prev, ok := c.netHistory[iface]; ok {
 	dt := now.Sub(prev.timestamp).Seconds()
-	if dt > 0 {
+	if dt > 0 && rxBytes >= prev.rxBytes && txBytes >= prev.txBytes {


P2 Badge Compute RX/TX rates independently on counter reset

The new guard in CollectNetwork requires both counters to be non-decreasing before computing either rate, so if only one counter resets (e.g., interface flap resets RX but TX continues), both RxRate and TxRate are forced to 0 for that sample. This introduces a metrics accuracy regression by dropping valid throughput for the unaffected direction; apply the reset check per counter so only the wrapped side is suppressed.


Closes #13

- show login errors in UI
- parallelize Docker stats collection
- add regression coverage for websocket, login limits, CPU race, fail2ban parsing, and Docker stats

Co-authored-by: Codex <codex@openai.com>
@LiukScot LiukScot merged commit 4e0d53b into main Apr 20, 2026
1 check was pending
@LiukScot LiukScot deleted the codebase-analysis/fixes-2026-04-11 branch April 20, 2026 19:59

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Codebase Analysis] Findings — 2026-04-11

2 participants