Classify Wasm traps as POSIX signals #680

Closed
brandonpayton wants to merge 2 commits into Automattic:main from brandonpayton:emdash/why-segmentation-fault-undb6

Conversation

@brandonpayton brandonpayton commented Jun 11, 2026

Member

Purpose

Kandelo currently collapses unexpected Wasm process traps into the generic POSIX-style SIGSEGV/139 result, which makes crashes show up as Segmentation fault even when the trap was arithmetic or illegal control flow. This change makes the host-facing process status more informative while preserving the existing POSIX signal-shaped wait status contract.

The goal is not to claim perfect POSIX provenance from Wasm. It is to use the trap reason that the engine exposes when it is reliable enough, keep SIGSEGV for memory/stack/generic bounds failures, and avoid masking arbitrary user-space unreachable traps as successful process exits.
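For readers unfamiliar with the `SIGSEGV`/139 pairing: 139 is the shell-style exit code 128 + 11, where 11 is the Linux number for `SIGSEGV`. The sketch below decodes a wait status and derives that code. It assumes Linux signal numbering and the common low-7-bits wait-status layout; `decodeWaitStatus` and `shellExitCode` are hypothetical names for illustration, not Kandelo APIs.

```typescript
// Illustrative sketch only. Assumes Linux signal numbers and the common
// wait-status layout (low 7 bits = terminating signal, next byte = exit code).
const SIGSEGV = 11;

type ProcessResult =
  | { kind: "exited"; code: number }
  | { kind: "signaled"; signal: number }
  | { kind: "other" }; // stopped/continued states are not modeled here

function decodeWaitStatus(status: number): ProcessResult {
  const termSignal = status & 0x7f;
  if (termSignal === 0) {
    // Normal exit: the exit code lives in bits 8..15.
    return { kind: "exited", code: (status >> 8) & 0xff };
  }
  if (termSignal === 0x7f) {
    return { kind: "other" };
  }
  return { kind: "signaled", signal: termSignal };
}

// Shell convention: a process killed by signal N is reported as 128 + N.
function shellExitCode(result: ProcessResult): number | undefined {
  if (result.kind === "exited") return result.code;
  if (result.kind === "signaled") return 128 + result.signal;
  return undefined;
}

const segv = shellExitCode(decodeWaitStatus(SIGSEGV)); // 128 + 11
```

Under these assumptions, any other classified signal keeps the same wait-status shape and only the signal number changes.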

Summary

  • add shared Wasm trap-to-signal classification for memory/bounds, stack, arithmetic, and illegal control-flow traps
  • propagate classified signal numbers through Node and browser process-worker crash finalization
  • add host integration fixtures/tests and a cross-browser Playwright trap-message classification test
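A minimal sketch of what a message-based trap classifier could look like follows. It is not the code in this PR: the function name, the rule table, the choice of `SIGFPE` for arithmetic traps and `SIGILL` for illegal control flow, the `SIGSEGV` fallback, and most of the message patterns are assumptions for illustration. Signal numbers are the Linux values.

```typescript
// Hypothetical sketch, not Kandelo's classifier. Message patterns vary by
// engine and version; the ones below are illustrative assumptions.
const SIGILL = 4;
const SIGFPE = 8;
const SIGSEGV = 11;

type TrapRule = { pattern: RegExp; signal: number };

// Order matters: the first matching rule wins.
const TRAP_RULES: TrapRule[] = [
  // Arithmetic traps.
  { pattern: /divide by zero|division by zero|integer overflow/i, signal: SIGFPE },
  // Illegal control flow: unreachable, bad indirect calls.
  { pattern: /unreachable|indirect call|signature mismatch/i, signal: SIGILL },
  // Memory, generic bounds, and stack exhaustion all stay SIGSEGV.
  { pattern: /out of bounds|call stack|too much recursion/i, signal: SIGSEGV },
];

function classifyTrapMessage(message: string): number {
  for (const rule of TRAP_RULES) {
    if (rule.pattern.test(message)) return rule.signal;
  }
  // Unknown trap text: keep the previous generic behavior (assumed).
  return SIGSEGV;
}
```

Matching on a generic `out of bounds` substring, rather than trying to tell memory from table accesses, keeps the result the same across engines that do not expose that distinction.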

Notes

  • Firefox reports both linear-memory and table bounds traps as RuntimeError: index out of bounds, so the classifier treats generic Wasm bounds traps as SIGSEGV rather than claiming a memory/table distinction.
  • CI currently runs the fork-safe browser smoke workflow only. The broader staging/test workflow is configured as same-repo PR only, so the host kernel-worker test gate does not run for this fork PR.

Tests

  • nix develop -c bash -c 'npm exec vitest -- run test/trap-signals.test.ts test/wasm-trap.test.ts'
  • nix develop -c bash -c 'npm exec playwright -- test wasm-trap-signal.spec.ts --project=chromium --project=firefox --project=webkit'
  • nix develop -c bash -c 'npm run build'

@brandonpayton

Member Author

Closing in favor of same-repo PR #685 so the staging/test gate can run on the host kernel-worker changes.

brandonpayton added a commit that referenced this pull request Jun 13, 2026
## Purpose

Kandelo was reporting unexpected Wasm process traps through the
POSIX-shaped wait status as generic `SIGSEGV`/`139`, so users saw
`Segmentation fault` even when the engine trap was arithmetic or illegal
control flow. This change makes those crash statuses more informative
while preserving the existing wait-status contract.

The goal is not to claim perfect POSIX provenance from Wasm. It is to
use the trap reason exposed by the engine when it is reliable enough,
keep `SIGSEGV` for memory/stack/generic bounds failures, and avoid
masking arbitrary user-space `unreachable` traps as successful exits.

## Summary

- add shared Wasm trap-to-signal classification for memory/bounds,
stack, arithmetic, and illegal control-flow traps
- propagate classified signal numbers through Node and browser
process-worker crash finalization
- reject non-Wasm executable bytes before worker launch with `ENOEXEC`,
so native host binaries cannot surface as opaque `WebAssembly.compile()`
crashes
- keep guest examples from inheriting the host `PATH`, preventing
`posix_spawnp()` from resolving native host tools such as
`/usr/bin/echo`
- make local Node workers avoid stale ignored
`host/dist/node-kernel-worker-entry.js` artifacts when the TypeScript
source is newer
- cover trap classification in host tests and deterministic
Chromium/Firefox/WebKit browser tests
- make host kernel worker changes trigger browser smoke coverage,
including the trap classifier spec
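As an illustration of the `ENOEXEC` item above, one way to reject non-Wasm bytes before launching a worker is to check the 4-byte Wasm magic header (`\0asm`). This is a sketch under assumptions, not the commit's implementation: `looksLikeWasm`, `assertWasmExecutable`, and the error shape are hypothetical.

```typescript
// Hypothetical sketch. Every Wasm binary module starts with the bytes
// 0x00 0x61 0x73 0x6d ("\0asm"), followed by a 4-byte version field.
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d];

function looksLikeWasm(bytes: Uint8Array): boolean {
  return bytes.length >= 8 && WASM_MAGIC.every((b, i) => bytes[i] === b);
}

// Fail early with an ENOEXEC-style error instead of letting
// WebAssembly.compile() reject later with an opaque CompileError.
function assertWasmExecutable(bytes: Uint8Array, path: string): void {
  if (!looksLikeWasm(bytes)) {
    const err = new Error(`${path}: not a WebAssembly module`) as Error & { code?: string };
    err.code = "ENOEXEC"; // assumed error shape for this sketch
    throw err;
  }
}

const wasmHeader = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const elfHeader = new Uint8Array([0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00]);
```

With these inputs, `looksLikeWasm(wasmHeader)` is true, while `assertWasmExecutable(elfHeader, "/usr/bin/echo")` throws an error whose `code` is `"ENOEXEC"`.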

## Notes

- Firefox reports both linear-memory and table bounds traps as
`RuntimeError: index out of bounds`, so the classifier treats generic
Wasm bounds traps as `SIGSEGV` rather than claiming a memory/table
distinction.
- The staging browser suite runs the full fast app suite in Chromium and
the trap classifier spec in Chromium/Firefox. WebKit trap-classifier
coverage runs in the Browser demo smoke workflow because WebKit is not
reliable inside the Nix staging browser job.
- Replaces fork PR #680 so the same-repo staging/test gate can run.

## Validation

- `nix develop -c bash -c 'cd host && npm exec vitest -- run
test/trap-signals.test.ts test/wasm-trap.test.ts
test/centralized-spawn.test.ts'`
- `nix develop -c bash -c 'scripts/run-libc-tests.sh functional spawn'`
- `nix develop -c bash -c 'cd apps/browser-demos && CI=true
KANDELO_PLAYWRIGHT_PORT=5414 PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 npx
playwright test wasm-trap-signal.spec.ts --project=chromium
--project=firefox --project=webkit --workers=1'`
- `nix develop -c bash -c 'cd host && npm run build'`
- Browser demo smoke tests:
https://github.com/Automattic/kandelo/actions/runs/27442470901
- Staging build:
https://github.com/Automattic/kandelo/actions/runs/27442470918