Review fixes: close leaks, UAF race, framing, sanitizer CI by nyo16 · Pull Request #4 · nyo16/net_runner

nyo16 · 2026-04-17T00:56:43Z

Summary

Implements the findings from the professional code review in
/Users/nmaroulis/.claude/plans/do-a-thoroough-code-wild-bengio.md,
addressing zero-leak / zero-zombie guarantees across the NIF, shepherd,
and Elixir layers. 18 commits, each a focused fix.

Critical fixes (correctness)

FD leak on mutex alloc failure (nif_create_fd) — previously the
dtor gated close() on res->lock being non-NULL, so an ENOMEM
during mutex create leaked the fd and armed a NULL-deref in any
later nif_close.
Use-after-close race — nif_read/nif_write copied fd out of
the lock before the syscall; a concurrent nif_close / owner-death
could close the fd before the syscall ran, hitting a recycled fd.
Lock is now held across the syscall and enif_select; the actual
close() runs in the io_resource_stop callback after BEAM drains
any pending select.
Lost initial stderr in :consume mode — kick_stderr_read sent
{:stderr_data, _} to self() but no handle_info matched, so
first-chunk stderr for short-lived processes was dropped.
write_loop {:ok, 0} spin — if the kernel ever returned 0 on
a non-empty write, the GenServer spun forever. Bounded with a 1 ms
sleep-retry.

High-severity fixes (shepherd C layer)

Command framing — shepherd parsed only buf[0], discarding any
coalesced/tail commands. Now frames are length-dispatched per
opcode with a 64-byte carry-over buffer across poll() iterations.
sendmsg in send_fds — EINTR loop, verify full 1-byte send,
log errno on failure instead of returning an opaque -1.
Post-fork signal safety — replaced fprintf/strerror in the
post-fork child window with a write(2)-based child_fail()
helper (async-signal-safe). Checked every dup2 / setsid /
TIOCSCTTY return; on failure the child exits 127 with a
diagnostic instead of running with broken stdio.
waitpid after SIGKILL — bounded WNOHANG loop (~3 s cap) so
shepherd can't hang on a D-state child.
SIGCHLD reaping — loop waitpid(-1, ..., WNOHANG) per drain;
previously one signal could leave zombies if ever coalesced.
Cgroup & UDS path hardening — validate every snprintf return,
reject too-long UDS paths, set FD_CLOEXEC on PTY master, treat
user-requested cgroup setup failure as fatal, and replace the fixed
100 ms usleep in cgroup_cleanup with a bounded polling rmdir.

Elixir layer hardening

Stream consumer crash cleanup — Process GenServer now accepts an
optional :owner pid, monitors it, and SIGKILL-stops the OS process
on DOWN. NetRunner.Stream.stream/3 passes self() so a consumer
crash tears down the chain even though Stream.resource's after
didn't fire.
Watcher non-blocking escalation — Process.sleep(5_000) in
handle_info replaced with Process.send_after/3 + a new clause.
Parked-caller monitoring — Operations.park/4 now
Process.monitors the caller. Dead callers are pruned on DOWN
instead of accumulating in the pending map.
read_uds_message — replaced peek-then-recv (race-prone when
the payload arrives a moment after the opcode) with opcode-first
reads and proper timeouts.
Input validation — reject non-binary / empty / NUL-containing
cmd and args at the spawn boundary.
File.rm error handling — tolerate :enoent on the UDS socket
path, propagate real errors.
Signal source of truth — Signal.resolve delegates to the NIF
instead of maintaining a duplicate allow-list; integer range
clamped to 1..31.
Daemon drain resilience — warn (and keep running) when a drain
task dies abnormally; wrap drain_loop so reader/logger exceptions
can't take the daemon down via Task link.
terminate/2 explicitly closes the shepherd Port after the
UDS socket so the teardown order is deterministic.

Tooling

ASan + UBSan build: SANITIZE=1 make all (or make asan). CI
gains a sanitizers job that builds the NIF and shepherd with
AddressSanitizer + UBSan, preloads libasan, and runs the full
mix test. The publish job depends on it.
Stale socket sweep in test_helper.exs (before + after suite).
New test: binary-with-NUL-byte round-trip through the shepherd.

Test plan

mix compile --warnings-as-errors
mix format --check-formatted
mix credo --strict (clean)
mix dialyzer (0 errors)
mix test — 120 passed, 10× consecutive runs all green
SANITIZE=1 make clean all builds locally on macOS
CI sanitizers job runs clean on Linux (verified on push)

If enif_mutex_create returned NULL the resource was returned to Elixir with res->lock == NULL and res->fd set. The destructor gated close() on a non-NULL lock, so the FD leaked; any subsequent nif_close would also dereference NULL and crash. Now nif_create_fd checks the mutex result, releases the resource on failure (so dtor reclaims the fd), and returns {:error, :mutex_failed}. io_resource_dtor no longer gates close() on lock presence.

nif_read / nif_write copied res->fd under lock and released the lock before the read()/write() syscall. A concurrent nif_close or owner-death down callback could close the fd on another thread before the syscall ran, letting the read/write target a recycled fd. Restructure so the mutex is held across the syscall and the subsequent enif_select registration. Since fds are O_NONBLOCK, the lock is held only briefly. Move the actual close() into the enif_select stop callback. nif_close and io_resource_down now mark the resource closed, detach the fd from res->fd, release the lock, and call enif_select(SELECT_STOP). BEAM waits for any in-flight select to drain before running the stop callback, which closes the fd safely.

In :consume mode, kick_stderr_read is called from init/1 to arm enif_select. If the initial Pipe.read returned {:ok, data} synchronously (short-lived processes whose stderr is already in the pipe buffer), it sent {:stderr_data, data} to self() but no handle_info clause matched, so the generic catch-all dropped it. The first (and sometimes only) chunk of stderr was lost, corrupting AbnormalExit messages and run/2 stderr capture for fast-exiting commands. Add the missing handle_info({:stderr_data, _}, state) clause that appends to stderr_buffer, records the bytes in stats, and drains any remaining data via consume_stderr. Also pass @default_read_size explicitly from kick_stderr_read instead of relying on Pipe.read/1's implicit default, keeping read sizes consistent across the module.

Two issues in the stdin write path: 1. If nif_write returned {:ok, 0} on a non-empty buffer (legal for write(2) on non-regular files), write_loop would recurse forever with the same data, wedging the GenServer. Park the caller and wait for :ready_output instead of spinning. 2. On repeated partial writes, write_loop kept recursing inside handle_call, delaying unrelated messages (kill, await_exit, :select) until either completion or EAGAIN. After one successful partial, park the caller with the remainder and let :ready_output drive subsequent writes. Apply the same shape to retry_write_loop on the :ready_output path.

The event loop read up to 16 bytes from the UDS socket and passed them straight to handle_command, which only parsed buf[0] and ignored the remainder. Coalesced commands (e.g. CMD_CLOSE_STDIN followed by CMD_KILL delivered in a single read) dropped everything after the first opcode. A partial tail crossing a read boundary was also lost. Introduce command_length() per opcode, a loop-parsing handle_commands() helper, and a 64-byte carry-over buffer in event_loop() that retains any truncated tail across reads. Unknown opcodes advance one byte so the parser cannot stall on bad input. read() errors other than EINTR / EAGAIN now kill the child and exit rather than silently looping.

send_fds: * Loop on EINTR and require ret == 1 (the full 1-byte send) instead of accepting any positive return. * Log errno on both calloc and sendmsg failures so the caller sees a diagnostic instead of an opaque "failed to send FDs". Post-fork child: * Between fork() and execvp(), POSIX allows only async-signal-safe functions. Replace fprintf/strerror on failure with a new child_fail() helper that uses only write(2) on a stack buffer. * Check setsid, TIOCSCTTY, and every dup2 return value. If any of these fail the child now reports via child_fail() and _exit(127) instead of silently running with the wrong stdio or no controlling terminal.

kill_child() previously called waitpid(child_pid, NULL, 0) after SIGKILL. If the child was in uninterruptible kernel sleep (D-state), this blocked the shepherd indefinitely — leaking the process group and preventing any further protocol communication with BEAM. Replace with a bounded WNOHANG reap loop (~3s cap) so control returns even when the kernel cannot reap immediately; cgroup_cleanup() provides the authoritative kill path on Linux. The event-loop SIGCHLD handler reaped at most one child per drain. SIGCHLD is coalesced by the kernel, so multiple pending exits could deliver as a single signal; the single waitpid missed them. Loop with waitpid(-1, ..., WNOHANG) until no more reapable children remain and flip child_exited only for the managed pid.

* Validate every snprintf return value for cgroup full_path / procs_path / kill_path; treat truncation as error instead of silently writing to the wrong file. * Validate uds_path length before strncpy into sockaddr_un.sun_path; reject paths that would truncate rather than failing opaquely in connect(). * Replace the fixed 100ms usleep before rmdir in cgroup_cleanup() with a bounded polling loop (~1s) that retries on EBUSY / ENOTEMPTY. Stops cgroup directories leaking when the kernel reaper is slow. * Treat cgroup_setup failure as fatal for the spawn (send_error, kill child, exit). A user who supplied a cgroup_path expects isolation; silently running unisolated violates the contract. * Set FD_CLOEXEC on the PTY master fd so any future shepherd fork doesn't leak it.

Stream.resource's after callback is only run on normal termination or {:halt, _}. If the consumer crashed mid-iteration (e.g. pattern-match failure), cleanup_process/1 was never invoked, the Process GenServer lived on, and Watcher couldn't help because it monitors the GenServer, not the consumer. Add an optional :owner option to NetRunner.Process that takes a pid to monitor. On DOWN the GenServer SIGKILLs the OS process (through both the shepherd protocol and a direct NIF kill) and stops :normal. NetRunner.Stream.stream/3 now passes self() as :owner so consumer crashes reliably tear down the whole chain.

process.ex terminate/2 now explicitly closes the shepherd Port after pipes and the UDS socket. The Port is linked so it would die with the GenServer anyway, but closing it explicitly makes the cleanup order deterministic. watcher.ex replaced Process.sleep(5000) inside handle_info/2 with a Process.send_after(:escalate_to_sigkill, 5000) message and a new handle_info/2 clause. A Watcher can now still receive messages (e.g. supervisor shutdown) during the SIGTERM grace window instead of being wedged unresponsive.

* Exec.read_uds_message: replace peek-then-recv (which can desync if the first byte arrives ahead of its payload) with opcode-first reads. Read 1 byte for the opcode, then the opcode-specific tail. Timeouts bumped to 500ms. * Exec.spawn_process: validate cmd and args before Port.open. Reject non-binary, empty, or NUL-byte-containing values. Passing these through Port.open's args: is undefined behaviour on the C side. * Exec.cleanup_listener: propagate File.rm errors except :enoent (which is expected when the shepherd already unlinked the socket). * Signal.resolve: reject integer signals outside the POSIX 1..31 range instead of forwarding arbitrary ints to kill(2).

Operations.park previously stored the GenServer.from tuple but did not monitor the caller pid. If a caller timed out its GenServer.call or crashed while parked on EAGAIN, its entry stayed in pending until the Process exited; finish_exit then replied to a dead pid. For a long- running process with many client timeouts this leaked memory. Operations now takes Process.monitor(caller_pid) in park/4 and tracks a second map (monitors: caller_monitor_ref → op_ref). pop/2 demonitors the caller. New pop_by_monitor/2 lets the GenServer remove an entry when it observes the caller's :DOWN. reply_all/2 flushes all monitors. The Process GenServer's :DOWN handler now dispatches: if the monitor ref matches owner_ref (the stream consumer) it SIGKILLs and stops, else it tries Operations.pop_by_monitor to prune a dead parked caller.

Two defects in NetRunner.Daemon: * The generic handle_info({:DOWN, ...}) clause matched any :DOWN and dropped it. A drain Task crashing (abnormal reason) looked identical to normal EOF, so a failing drain stopped invisibly and output thereafter piled up in the pipe until blocking the child process. Narrow the handler to refs we recognise (drain_ref / stderr_drain_ref) and log a Logger.warning when reason != :normal. * drain_loop relied on reader.() never raising. If the Proc GenServer terminated mid-call or on_output blew up on a non-UTF-8 chunk, the linked drain Task crashed, which propagated to the Daemon through Task.async's link and crashed the whole daemon. Wrap drain_loop in try/rescue/catch and isolate on_output in its own rescue so neither path can take down the Daemon.

Signal.resolve previously maintained an Elixir @signals allow-list that duplicated the strcmp list inside nif_signal_number in C. Adding or renaming a signal meant editing two places and any drift produced a silent "unknown_signal" even when the other side supported it. Delegate to the NIF for any atom input; integer-range validation stays in Elixir. The NIF's existing {:error, :unknown_signal} return surfaces unsupported atoms unchanged.

Enable \`SANITIZE=1 make all\` (or the \`make asan\` convenience target) to rebuild the NIF and shepherd with AddressSanitizer + UBSan. When SANITIZE=1 the build switches to -O1 and drops _FORTIFY_SOURCE (they conflict with ASan's memcpy interceptors). Add a new CI job \`sanitizers\` that runs the full Elixir test suite with LD_PRELOAD=libasan so the NIF and shepherd catch memory-safety and undefined-behaviour bugs in the C layer on every push. Gate the publish job on the sanitizers job passing.

The earlier change that parked the caller after a single partial write broke large-buffer streaming. Parking skipped the NIF's EAGAIN path, which is the only place enif_select is armed for :ready_output; the parked caller then hung forever because no readiness signal would arrive. A 100 KB cat stream test reliably timed out after 60s. Restore the original write_loop shape (recurse until the NIF returns either :eagain or completion) so enif_select stays in charge of readiness. Keep the {:ok, 0} guard from the original review fix — if the kernel ever returns 0 on a non-empty write we sleep(1) and retry instead of spinning the dirty scheduler. retry_write_loop mirrors the same shape. Also add a test_helper.exs hook that purges stale /tmp/net_runner_*.sock files before and after every test run; stops accumulation from tests that crash before cleanup_listener runs.

Runs \`printf 'a\0b\0c'\` through the shepherd and asserts the full 5 bytes survive the read path. Guards against any accidental use of String functions on the output binary that would truncate at the first NUL.

Credo strict flagged the cond/true as a refactoring opportunity; with only two branches an if/else is clearer and identical in behaviour.

Public-API fix: NetRunner.run pattern-matched {:ok, pid} on Proc.start, which raised MatchError for any {:error, reason} — including the new validation errors from E12. Return the error tuple directly. New regression tests: * test/signal_test.exs — Signal.resolve range, unknown atoms, bad types, resolve! raising behaviour (E15). * test/process_test.exs — * NUL-byte rejection in cmd/args (E12) * Proc.start with empty cmd rejected (E12) * stderr-only fast-exit command: clean exit + bytes_err stats nonzero (E1) * :owner monitor SIGKILLs OS process when owner crashes (E3) * test/net_runner_test.exs — * run/2 and stream/2 surface validation errors instead of crashing * Timeout path reaps OS processes (no sleep pid leaks after five 100ms timeouts against sleep 30)

Fix CI regression from C9 (treat cgroup setup failure as fatal): The existing Linux-only test assumed CI had write access to /sys/fs/cgroup/; it does not. Before C9 this silently succeeded (child ran outside the cgroup). After C9 the shepherd correctly reports {:error, _}. Rewrite the test to accept either outcome — success on a privileged run, a clean error on an unprivileged one. Either confirms the option is plumbed through. Add a negative test for path validation (absolute paths and traversal are rejected). CHANGELOG: document every fix from the review pass under [Unreleased] — FD leak on mutex failure, use-after-close race, lost stderr, write-loop spin, shepherd framing, sendmsg hardening, post-fork signal safety, waitpid bounds, SIGCHLD reap loop, cgroup/path validation, stream consumer-crash cleanup, Watcher send_after, parked-caller monitors, UDS read redesign, NUL-byte validation, run/2 error surface, File.rm errors, Signal range/source-of-truth, Daemon drain resilience, explicit Port.close, ASan/UBSan build + CI job, stale socket sweep, regression tests. README: new sections on input validation / error returns, binary output with NUL bytes, the Command DSL, an error-handling cheatsheet, and examples for :owner monitoring and per-call kill_timeout.

nyo16 added 20 commits April 16, 2026 19:17

Add binary-with-NUL-bytes round-trip test

f143b0f

Runs \`printf 'a\0b\0c'\` through the shepherd and asserts the full 5 bytes survive the read path. Guards against any accidental use of String functions on the output binary that would truncate at the first NUL.

Replace cond with if/else in Process DOWN dispatch

fa2546d

Credo strict flagged the cond/true as a refactoring opportunity; with only two branches an if/else is clearer and identical in behaviour.

nyo16 merged commit f75659f into master Apr 17, 2026
10 checks passed

nyo16 deleted the review/code-review-fixes branch April 17, 2026 12:01

nyo16 mentioned this pull request Jun 6, 2026

Fix review findings: stderr API, UDS perms, signal range, write loop,… #7

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Review fixes: close leaks, UAF race, framing, sanitizer CI#4

Review fixes: close leaks, UAF race, framing, sanitizer CI#4
nyo16 merged 20 commits into
masterfrom
review/code-review-fixes

nyo16 commented Apr 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nyo16 commented Apr 17, 2026

Summary

Critical fixes (correctness)

High-severity fixes (shepherd C layer)

Elixir layer hardening

Tooling

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant