Fall back to signalfd+SIGCHLD when pidfd_open returns EPERM #160

Open
paddor wants to merge 2 commits into socketry:main from paddor:signalfd-fallback-for-process-wait

paddor commented Mar 15, 2026

Summary

  • When pidfd_open fails with EPERM (e.g. inside snap confinement), the epoll and uring selectors now fall back to a signalfd watching SIGCHLD instead of raising an error
  • Since SIGCHLD fires for any child, the fallback loops until the specific target pid exits
  • Guarded by #ifdef HAVE_SYS_SIGNALFD_H; on systems without signalfd, behavior is unchanged (raises on EPERM as before)
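
To make the decision described in the bullets above concrete, here is a minimal sketch of choosing between the pidfd path and the fallback. It is not code from this PR: `choose_wait_strategy` and the `wait_strategy` enum are hypothetical names, and `HAVE_SYS_SIGNALFD_H` is defined locally only so the fragment compiles on its own (in a real build it would come from the header probe).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434 /* assumed: number from the unified syscall table */
#endif

/* Assumption: normally set by the build's header probe, defined here for a standalone sketch. */
#ifndef HAVE_SYS_SIGNALFD_H
#define HAVE_SYS_SIGNALFD_H 1
#endif

/* Hypothetical result type, for illustration only. */
enum wait_strategy { WAIT_FAILED, WAIT_WITH_PIDFD, WAIT_WITH_SIGNALFD };

/* Try pidfd_open first; select the signalfd fallback only when it fails with EPERM. */
static enum wait_strategy choose_wait_strategy(pid_t pid, int *descriptor)
{
    long fd = syscall(SYS_pidfd_open, pid, 0);
    if (fd >= 0) {
        *descriptor = (int)fd;
        return WAIT_WITH_PIDFD;
    }

    *descriptor = -1;
#ifdef HAVE_SYS_SIGNALFD_H
    if (errno == EPERM) return WAIT_WITH_SIGNALFD;
#endif
    /* Without signalfd support, or for any other errno, the caller raises as before. */
    return WAIT_FAILED;
}
```

Keeping the `errno == EPERM` test inside the `#ifdef` means a build without `<sys/signalfd.h>` compiles to the pre-existing behavior.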

Context

This is not urgent — snapd 2.75+ already allows pidfd_open in its default seccomp profile, so the confinement issue is resolved for most users. However, older snapd versions (and potentially other seccomp-confined environments) still block the syscall. Feel free to close this if you'd rather not carry the extra code path.

Test plan

  • Existing test suite passes (201 passed, 2 skipped)
  • New tests use seccomp-BPF (via Fiddle) in a forked child to block pidfd_open with EPERM, then exercise #process_wait through the signalfd path
  • Tests cover: already-exited process, still-running process, two sequential waits
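
For readers unfamiliar with the seccomp-BPF trick used by the tests, the following C sketch shows the same idea outside Ruby. It is an illustration under assumptions, not the PR's test code (which goes through Fiddle): the function names are hypothetical, and the filter skips the architecture check that a hardened filter would include.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434 /* assumed: number from the unified syscall table */
#endif

/* Install a filter that fails pidfd_open with EPERM and allows every other syscall. */
static int block_pidfd_open(void)
{
    struct sock_filter filter[] = {
        /* Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* If it is pidfd_open, fall through to the ERRNO return; otherwise skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_pidfd_open, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1) return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Fork, install the filter in the child only, and report the errno the child
 * saw from pidfd_open (0 if the call succeeded, -1 if the setup failed). */
static int errno_seen_by_filtered_child(void)
{
    pid_t child = fork();
    if (child == -1) return -1;

    if (child == 0) {
        if (block_pidfd_open() == -1) _exit(255);
        long fd = syscall(SYS_pidfd_open, getpid(), 0);
        _exit(fd == -1 ? errno : 0);
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status)) return -1;
    int code = WEXITSTATUS(status);
    return code == 255 ? -1 : code;
}
```

Installing the filter in a forked child matters because a seccomp filter cannot be removed once applied, so the parent test process stays unrestricted.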

Inside snap confinement (pre-snapd 2.75), the seccomp profile blocks
pidfd_open with EPERM. When this happens, the epoll and uring selectors
now fall back to a signalfd watching SIGCHLD:

- Block SIGCHLD via pthread_sigmask
- Create a signalfd (SFD_CLOEXEC | SFD_NONBLOCK)
- Register it with epoll/uring for readability
- Loop: yield → drain signalfd → waitpid(WNOHANG) for the target pid
- Cleanup: close signalfd, restore signal mask via rb_ensure

Since SIGCHLD fires for any child process, the fallback loops until the
specific target pid has exited.
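
The steps listed above can be sketched as a single blocking function. This is a simplified illustration, not the code from the commit: `wait_via_signalfd` is a hypothetical name, `poll` stands in for yielding to the epoll/uring selector, and `sigprocmask` replaces `pthread_sigmask` because the sketch is single-threaded.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for `pid` to exit using a signalfd on SIGCHLD.
 * Returns 0 and fills *status on success, -1 with errno set on failure. */
static int wait_via_signalfd(pid_t pid, int *status)
{
    sigset_t mask, previous;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);

    /* Block SIGCHLD so it stays pending and becomes readable via the signalfd.
     * (Assumption: single-threaded sketch; the PR uses pthread_sigmask.) */
    if (sigprocmask(SIG_BLOCK, &mask, &previous) == -1) return -1;

    int result = -1, saved_errno = 0;
    int fd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);
    if (fd == -1) {
        saved_errno = errno;
    } else {
        for (;;) {
            /* Check first: the target may have exited before SIGCHLD was blocked. */
            pid_t reaped = waitpid(pid, status, WNOHANG);
            if (reaped == pid) { result = 0; break; }
            if (reaped == -1) { saved_errno = errno; break; }

            /* Stand-in for yielding to the selector until the signalfd is readable. */
            struct pollfd pfd = { .fd = fd, .events = POLLIN };
            if (poll(&pfd, 1, -1) == -1 && errno != EINTR) { saved_errno = errno; break; }

            /* Drain queued records; the SIGCHLD may belong to an unrelated child. */
            struct signalfd_siginfo info;
            while (read(fd, &info, sizeof(info)) == (ssize_t)sizeof(info)) { }
        }
        close(fd);
    }

    /* Cleanup always restores the caller's signal mask. */
    sigprocmask(SIG_SETMASK, &previous, NULL);
    errno = saved_errno;
    return result;
}
```

Calling `waitpid` with the specific pid after every wakeup is what makes the loop robust: standard signals coalesce, so one readable signalfd can stand for several exited children, or for none that matter.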

Tests use seccomp-BPF (via Fiddle) in a forked child to block pidfd_open
and exercise the fallback path.

ioquatix (Member) commented

For the reasons you stated, I'm slightly against merging this. But if we could make the burden a little lighter the calculus might change.

Do you think we can extract the logic into a shared code file, so that we don't duplicate this between epoll and uring backends?

I know I deliberately avoided sharing a lot of code in the backends but it's mostly the mechanics of the scheduler, and I was aiming to have independent implementations where the logic was significantly tied to the OS interface. But for this, I think it's the same, right? So we could probably share the implementation and just have if (errno == EPERM) fallback_entry_point(...);

WDYT?

Move the signalfd lifecycle (open/check/close) into a shared file that
is #include'd by both epoll.c and uring.c, like pidfd.c. Each backend
keeps only its selector-specific registration and yield loop.

paddor (Author) commented Mar 16, 2026

Good idea — done in b2098b9. I extracted the signalfd lifecycle into process_wait_signalfd.c (included like pidfd.c), which provides three helpers:

  • process_wait_signalfd_open — blocks SIGCHLD, creates signalfd, handles early exit
  • process_wait_signalfd_check — drains signalfd, calls waitpid for the target pid
  • process_wait_signalfd_close — closes signalfd, restores signal mask

Each backend keeps only its selector-specific registration loop (epoll_ctl vs io_uring_prep_poll_add). The EPERM fallback in each process_wait is now ~15 lines.
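
As a rough illustration of the selector-specific part that stays in the epoll backend, the sketch below registers a descriptor for readability, waits, and deregisters it. It is an assumption-laden stand-in rather than the PR's code: `epoll_wait_readable` is a hypothetical name, and the blocking `epoll_wait` takes the place of yielding the fiber back to the event loop.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register `fd` for readability on an existing epoll instance, block until it
 * is ready, then deregister it. Returns 0 when readable, -1 on error. */
static int epoll_wait_readable(int epoll_fd, int fd)
{
    struct epoll_event event = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fd, &event) == -1) return -1;

    struct epoll_event ready;
    int count;
    do {
        /* Stand-in for yielding to the scheduler; retry if a signal interrupts. */
        count = epoll_wait(epoll_fd, &ready, 1, -1);
    } while (count == -1 && errno == EINTR);

    /* Deregister without clobbering the errno from epoll_wait. */
    int saved_errno = errno;
    epoll_ctl(epoll_fd, EPOLL_CTL_DEL, fd, NULL);
    errno = saved_errno;

    return count == 1 ? 0 : -1;
}
```

With the signalfd passed as `fd`, each wakeup would be followed by the shared check helper; a uring backend would presumably do the equivalent with a poll submission.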

