nixos-tests: disable SQLite WAL to prevent SIGBUS in CI by amaanq · Pull Request #1615 · NixOS/hydra

amaanq · 2026-03-30T02:08:43Z

Problem

SQLite in WAL mode mmaps a shared memory file that can fault under concurrent
access, which kills nix with SIGBUS.

Solution

Disabling WAL eliminates the shared memory file entirely, so the fault can no longer occur.
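To make the effect concrete at the SQLite level, here is a minimal Python sketch. It is not the change in this PR: it uses a throwaway database in a temp directory rather than the Nix store database, the function names are made up for illustration, and it assumes a local filesystem and an SQLite build with WAL support. On the Nix side the relevant knob is presumably the use-sqlite-wal setting, but that is an assumption since the diff is not shown here. The sketch records which sidecar files exist next to the database in WAL mode and again after switching to the rollback journal.

```python
import os
import sqlite3
import tempfile


def sidecar_files(db_path):
    """Report whether the WAL file and the shared-memory WAL-index exist."""
    return {suffix: os.path.exists(db_path + suffix) for suffix in ("-wal", "-shm")}


def wal_vs_rollback_demo():
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.sqlite")  # hypothetical scratch database
        # isolation_level=None keeps the connection in autocommit mode, so the
        # journal_mode pragma is never issued inside an open transaction.
        conn = sqlite3.connect(db_path, isolation_level=None)
        try:
            wal_mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1)")
            in_wal = sidecar_files(db_path)

            # Leave WAL mode: SQLite checkpoints and falls back to a rollback journal.
            rollback_mode = conn.execute("PRAGMA journal_mode = DELETE").fetchone()[0]
            conn.execute("INSERT INTO t VALUES (2)")
            in_rollback = sidecar_files(db_path)
        finally:
            conn.close()
    return wal_mode, in_wal, rollback_mode, in_rollback


if __name__ == "__main__":
    print(wal_vs_rollback_demo())
```

With the rollback journal there is no -shm file to map, so there is no shared mapping that could fault.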

Additional Context

This was the cause of the spurious CI failures we've been seeing recently (1, 2).

I spammed CI in my fork with a SIGBUS handler installed to actually find the root cause of this; see here :) https://github.com/amaanq/hydra/actions/runs/23701271798/job/69045258619?pr=1#step:5:8309

Relevant snippet

( STDERR )  job 56    *** SIGBUS (Bus error) at address 0x0000fffff75f4000
( STDERR )  job 56    hydra-queue-runner(+0x39f434) [0xaaaaaae3f434]
( STDERR )  job 56    linux-vdso.so.1(__kernel_rt_sigreturn+0x0) [0xfffff7ffa850]
( STDERR )  job 56    /nix/store/2z8w3q6z3yskyj2ng3bga5h7x30sxdab-glibc-2.40-66/lib/libc.so.6(+0xaedb8) [0xfffff73dedb8]
( STDERR )  job 56    /nix/store/1b3almilx7nyrqacfrgnsmayja1dyrdc-sqlite-3.50.4/lib/libsqlite3.so(+0x8251c) [0xfffff707251c]
( STDERR )  job 56    /nix/store/1b3almilx7nyrqacfrgnsmayja1dyrdc-sqlite-3.50.4/lib/libsqlite3.so(+0x828fc) [0xfffff70728fc]
( STDERR )  job 56    /nix/store/1b3almilx7nyrqacfrgnsmayja1dyrdc-sqlite-3.50.4/lib/libsqlite3.so(+0xae6b4) [0xfffff709e6b4]
( STDERR )  job 56    /nix/store/1b3almilx7nyrqacfrgnsmayja1dyrdc-sqlite-3.50.4/lib/libsqlite3.so(+0xaf338) [0xfffff709f338]
( STDERR )  job 56    /nix/store/1b3almilx7nyrqacfrgnsmayja1dyrdc-sqlite-3.50.4/lib/libsqlite3.so(+0xeb5c4) [0xfffff70db5c4]
( STDERR )  job 56    /nix/store/1b3almilx7nyrqacfrgnsmayja1dyrdc-sqlite-3.50.4/lib/libsqlite3.so(sqlite3_step+0x29c) [0xfffff70e18bc]
( STDERR )  job 56    /nix/store/m016npyg40c9qin11s0qwhv1rzibvgv7-nix-store-2.34.1/lib/libnixstore.so.2.34.1(_ZN3nix10SQLiteStmt3Use4nextEv+0x34) [0xfffff7c934b4]
( STDERR )  job 56    /nix/store/m016npyg40c9qin11s0qwhv1rzibvgv7-nix-store-2.34.1/lib/libnixstore.so.2.34.1(_ZN3nix10LocalStore21queryPathInfoInternalERNS0_5StateERKNS_9StorePathE+0xb8) [0xfffff7c2c998]
( STDERR )  job 56    /nix/store/m016npyg40c9qin11s0qwhv1rzibvgv7-nix-store-2.34.1/lib/libnixstore.so.2.34.1(_ZN3nix10LocalStore21queryPathInfoUncachedERKNS_9StorePathENS_8CallbackISt10shared_ptrIKNS_13ValidPathInfoEEEE+0x90) [0xfffff7c2d110]
( STDERR )  job 56    /nix/store/m016npyg40c9qin11s0qwhv1rzibvgv7-nix-store-2.34.1/lib/libnixstore.so.2.34.1(_ZN3nix5Store13queryPathInfoERKNS_9StorePathENS_8CallbackINS_3refIKNS_13ValidPathInfoEEEEE+0x3a8) [0xfffff7ca6038]
( STDERR )  job 56    /nix/store/m016npyg40c9qin11s0qwhv1rzibvgv7-nix-store-2.34.1/lib/libnixstore.so.2.34.1(_ZN3nix5Store13queryPathInfoERKNS_9StorePathE+0x110) [0xfffff7ca6490]
( STDERR )  job 56    /nix/store/m016npyg40c9qin11s0qwhv1rzibvgv7-nix-store-2.34.1/lib/libnixstore.so.2.34.1(+0x225d00) [0xfffff7c45d00]
( STDERR )  job 56    /nix/store/m016npyg40c9qin11s0qwhv1rzibvgv7-nix-store-2.34.1/lib/libnixstore.so.2.34.1(_ZN3nix5Store13topoSortPathsERKSt3setINS_9StorePathESt4lessIS2_ESaIS2_EE+0x130) [0xfffff7c3b030]
( STDERR )  job 56    /nix/store/m016npyg40c9qin11s0qwhv1rzibvgv7-nix-store-2.34.1/lib/libnixstore.so.2.34.1(_ZN3nix9copyPathsERNS_5StoreES1_RKSt3setINS_9StorePathESt4lessIS3_ESaIS3_EENS_10RepairFlagENS_13CheckSigsFlagENS_14SubstituteFlagE+0x1b0) [0xfffff7cab970]
( STDERR )  job 56    hydra-queue-runner(+0xcec488) [0xaaaaab78c488]
( STDERR )  job 56    hydra-queue-runner(+0xce22cc) [0xaaaaab7822cc]
( STDERR )  job 56    hydra-queue-runner(+0xce16ac) [0xaaaaab7816ac]
( STDERR )  job 56    hydra-queue-runner(+0x4a7098) [0xaaaaaaf47098]
( STDERR )  job 56    hydra-queue-runner(+0x38ef38) [0xaaaaaae2ef38]
( STDERR )  job 56    hydra-queue-runner(+0x317864) [0xaaaaaadb7864]
( STDERR )  job 56    hydra-queue-runner(+0xd4add8) [0xaaaaab7eadd8]
( STDERR )  job 56    hydra-queue-runner(+0xd6b4cc) [0xaaaaab80b4cc]
( STDERR )  job 56    hydra-queue-runner(+0xd6bdb4) [0xaaaaab80bdb4]
( STDERR )  job 56    hydra-queue-runner(+0xe2289c) [0xaaaaab8c289c]
( STDERR )  job 56    /nix/store/2z8w3q6z3yskyj2ng3bga5h7x30sxdab-glibc-2.40-66/lib/libc.so.6(+0x901ec) [0xfffff73c01ec]
( STDERR )  job 56    /nix/store/2z8w3q6z3yskyj2ng3bga5h7x30sxdab-glibc-2.40-66/lib/libc.so.6(+0x10034c) [0xfffff743034c]
( STDERR )  job 56    2026-03-29T04:30:45.539314Z ERROR start_bidirectional_stream: hydra_builder::grpc: stream message delivery failed: code: 'Unknown error', message: "h2 protocol error: error reading a body from connection", source: hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })
( STDERR )  job 56    2026-03-29T04:30:45.539342Z ERROR start_bidirectional_stream: hydra_builder::grpc: stream message delivery failed: code: 'Unknown error', message: "h2 protocol error: error reading a body from connection", source: hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })
( STDERR )  job 56    2026-03-29T04:30:45.563882Z ERROR process_build: hydra_builder::state: error=Import failure: `code: 'The service is currently unavailable', message: "tcp connect error", source: tonic::transport::Error(Transport, ConnectError(ConnectError("tcp connect error", [::1]:7001, Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })))` timings=BuildTimings { import_elapsed: 0ns, build_elapsed: 0ns, upload_elapsed: 0ns } drv=zym9dr516lin3f31z2kmy6pkgj0a96ka-out-is-directory.drv
( STDERR )  job 56    2026-03-29T04:30:45.563936Z ERROR hydra_builder::state: Build of zym9dr516lin3f31z2kmy6pkgj0a96ka-out-is-directory.drv failed with Import failure: `code: 'The service is currently unavailable', message: "tcp connect error", source: tonic::transport::Error(Transport, ConnectError(ConnectError("tcp connect error", [::1]:7001, Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })))`
( STDERR )  job 56    2026-03-29T04:30:45.563967Z ERROR hydra_builder::state: Failed to submit build failure info: err=code: 'Unknown error', message: "Service was not ready: transport error", retrying in=4.850542545s

Note that these crashes are impossible to debug without such a handler, but I don't think it should be installed by default here.

Alternatives Considered

I'd considered setting exclusive locking mode for SQLite in Nix upstream via PRAGMA locking_mode = EXCLUSIVE, as it keeps the WAL-index in heap memory rather than in an mmapped shared memory file. This has a huge downside, though: each process locks the database exclusively, because the index cannot be shared across multiple processes. In practice I doubt it would have much of an effect, as I imagine most users aren't running many concurrent Nix processes that need database access, but in Hydra we ran into this SIGBUS because our concurrent VM tests spawn multiple nix processes.
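The trade-off of that alternative can be sketched in Python as follows. This is an illustration only, not Nix code: the scratch database and the function name are hypothetical, and it relies on the documented SQLite behaviour that exclusive locking mode must be set before the first WAL-mode access for the WAL-index to live in heap memory. The first connection enters exclusive WAL mode, then a second connection (with no busy timeout) tries to read.

```python
import os
import sqlite3
import tempfile


def exclusive_wal_demo():
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.sqlite")  # hypothetical scratch database
        owner = sqlite3.connect(db_path, isolation_level=None)
        # timeout=0 makes the second connection fail immediately instead of waiting.
        other = sqlite3.connect(db_path, timeout=0, isolation_level=None)
        try:
            # Order matters: request exclusive locking before entering WAL mode.
            owner.execute("PRAGMA locking_mode = EXCLUSIVE").fetchone()
            mode = owner.execute("PRAGMA journal_mode = WAL").fetchone()[0]
            owner.execute("CREATE TABLE t (x INTEGER)")
            owner.execute("INSERT INTO t VALUES (1)")

            # The WAL-index is kept in heap memory, so no -shm file is created.
            shm_exists = os.path.exists(db_path + "-shm")

            # The downside: any other connection is locked out of the database.
            try:
                other.execute("SELECT count(*) FROM t").fetchone()
                blocked = False
            except sqlite3.OperationalError:
                blocked = True
        finally:
            other.close()
            owner.close()
    return mode, shm_exists, blocked


if __name__ == "__main__":
    print(exclusive_wal_demo())
```

The second connection stands in for a concurrent Nix process: it cannot read the database at all while the first one holds it.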

I'm not particularly happy with this solution. I'd also thought of just increasing the VM's memory to 2GB, but that isn't guaranteed to fix it, depending on whether the faults were caused by the file being truncated or by the kernel evicting the backing page.
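For readers unfamiliar with the truncation case, the following Python sketch reproduces a SIGBUS from a mapped file that shrinks underneath the mapping, and shows why a handler is needed to get any trace out of it. It is an analogue only: the handler used in the fork is not shown in this description, Python's standard faulthandler module stands in for it, and the child script and function name are made up for illustration. It assumes Linux-like mmap semantics.

```python
import signal
import subprocess
import sys

# Child script: map one page of a temp file, truncate the file to zero bytes,
# then touch the mapping. The page no longer has backing storage, so the
# kernel delivers SIGBUS. faulthandler prints a traceback before the process dies.
CHILD = r"""
import faulthandler, mmap, os, tempfile
faulthandler.enable()
with tempfile.TemporaryFile() as f:
    f.write(b"\0" * mmap.PAGESIZE)
    f.flush()
    m = mmap.mmap(f.fileno(), mmap.PAGESIZE)
    os.ftruncate(f.fileno(), 0)
    m[0]
"""


def run_truncated_mmap_child():
    """Run the child in a separate process so the crash does not kill the caller."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD], capture_output=True, text=True
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    returncode, stderr = run_truncated_mmap_child()
    print("killed by SIGBUS:", returncode == -signal.SIGBUS)
    print(stderr)
```

Without the faulthandler.enable() line the child would die with the same signal but print nothing, which matches the "no useful logs" symptom. The eviction case mentioned above is not covered by this sketch.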

SQLite in WAL mode mmaps a shared memory file that can fault under concurrent access, which kills nix with SIGBUS. This has occasionally been causing spurious CI failures with no useful logs. Disabling WAL eliminates the shared memory file entirely.

amaanq wants to merge 1 commit into NixOS:master from obsidiansystems:sigbus-fix
