fix(link): statically link BLAS/LAPACK/gfortran into the engine binary#20

Merged
ccomb merged 3 commits into main from fix/static-link-blas-lapack
May 3, 2026

@ccomb ccomb commented May 2, 2026

Companion to #19 (warn-on-missing-runtime-libs) — once this lands, the
warning becomes mostly cosmetic for the install.sh path: a fresh
Debian-slim container can run `volca --version` without first
`apt install`ing anything.

Why

The existing `static` LINK_MODE only links MUMPS statically. BLAS,
LAPACK, libgfortran and libquadmath were placed after `-Wl,-Bdynamic`,
so they stayed dynamic. End-users hit
`liblapack.so.3: cannot open shared object file` on minimal Linux
images (Debian-slim containers, Alpine, Fedora cloud) that don't
ship the numeric stack.

Fix

Move BLAS/LAPACK/gfortran/quadmath inside the same
`--start-group/--end-group` as the MUMPS libs. The linker iterates
until the cross-references settle — necessary because
libgfortran ↔ libgcc, libopenblas/lapack ↔ libgfortran, and MUMPS
↔ BLAS/LAPACK all need each other's symbols.

Only libc/libm/libpthread/libdl stay dynamic — system ABI, not worth
carrying twice.
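
A minimal sketch of the flag ordering described above, not the project's actual build script. The library names (`dmumps`, `mumps_common`, `pord`, `openblas`, and so on) are assumptions for illustration; only the grouping and the position of `-Wl,-Bdynamic` are the point.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed library names; the real list lives in the project's build config.
static_libs=(dmumps mumps_common pord openblas lapack gfortran quadmath)
dynamic_libs=(m pthread dl)

# Everything that must be static goes inside one linker group, so GNU ld
# rescans the archives until the cyclic references are resolved.
link_flags=(-Wl,-Bstatic -Wl,--start-group)
for lib in "${static_libs[@]}"; do
  link_flags+=("-l$lib")
done
link_flags+=(-Wl,--end-group)

# Only after the group closes does the linker switch back to dynamic mode.
link_flags+=(-Wl,-Bdynamic)
for lib in "${dynamic_libs[@]}"; do
  link_flags+=("-l$lib")
done

printf '%s\n' "${link_flags[*]}"
```

Before this change the numeric libraries sat to the right of `-Wl,-Bdynamic`, which is why they resolved to `.so` files.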

`darwin` and `windows` modes already do their own thing (ld64
fallback for darwin; MinGW for windows). Linux is the only
LINK_MODE this PR touches.

Tradeoffs

  • ~30 MB larger binary. Engine is already ~70 MB so this is roughly
    40% bigger, not a doubling.
  • Loses ability to swap a CVE'd OpenBLAS via `apt upgrade` — we
    ship a new release instead. Acceptable for now (no production
    consumers yet).
  • License: libgfortran has the GCC Runtime Library Exception,
    OpenBLAS + LAPACK are BSD-3 — all compatible with our Apache-2.0.

Test plan

  • CI 4-platform build green; Linux binary size ~30 MB larger
    than before.
  • On a fresh Debian-slim container with no BLAS/LAPACK/Fortran:
    `./install.sh && volca --version` prints the version, no
    loader errors. (The exact reproducer that motivated #19.)
  • Functional check: the binary still solves an LCA workload
    end-to-end (`volca compute` against a small fixture).
  • `ldd volca` shows only libc/libm/libpthread/libdl as
    external deps.
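
The `ldd` item in the test plan could be automated roughly as follows. This is a sketch under assumptions: the helper name `unexpected_deps` is hypothetical, the dynamic loader and vDSO entries are tolerated in addition to the four allowed libraries, and the sample input is illustrative rather than captured from a real build.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reads ldd-style output on stdin and prints every library name that is not
# on the allowlist (libc, libm, libpthread, libdl). The loader (ld-linux-*)
# and linux-vdso lines are also ignored; that tolerance is an assumption.
unexpected_deps() {
  awk '{ print $1 }' \
    | grep -Ev '^(libc|libm|libpthread|libdl)\.so|ld-linux|linux-vdso' \
    || true
}

# Illustrative input only, not real output from the volca binary.
sample_ldd_output='	linux-vdso.so.1 (0x00007ffc00000000)
	liblapack.so.3 => /lib/x86_64-linux-gnu/liblapack.so.3 (0x00007f0000000000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000200000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f0000400000)'

leftover="$(printf '%s\n' "$sample_ldd_output" | unexpected_deps)"
echo "unexpected: ${leftover:-none}"
```

Against a real build the input would come from `ldd` run on the binary instead of the sample string, and a non-empty result would fail the check.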

@ccomb ccomb force-pushed the fix/static-link-blas-lapack branch 3 times, most recently from 581aae5 to 2394050 Compare May 2, 2026 23:12
ccomb added 2 commits May 3, 2026 01:23
The published Linux engine dynamically links liblapack.so.3,
libblas.so.3, libgfortran.so.5, libquadmath.so.0 and libopenblas.so.0.
Minimal cloud images don't ship those, so install.sh succeeds but
`volca --version` then fails with "cannot open shared object file".

Switch the `static` LINK_MODE to a real static link:

- `executable-static: True` + `shared: False` make cabal build every
  dep as `.a` and ask GHC for `-static`. With no intermediate
  libHS*.so, non-PIC libgfortran.a never gets pulled into a shared
  object — that's what tripped R_X86_64_TPOFF32 in earlier attempts.
- `-no-pie` keeps Debian's non-PIC system Fortran/BLAS archives
  linkable into the final non-PIE executable.
- A single `--start-group`/`--end-group` resolves the cyclic refs
  between MUMPS / LAPACK / BLAS / libgfortran / libquadmath.
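
As a rough illustration of the settings listed above, a config-generating script might emit something like the following. This is a sketch, not the contents of `gen-cabal-config.sh`: the output path is a temporary file, and placing `-no-pie` under `ghc-options` in a `package volca` stanza is an assumption about how the flag is passed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output path; the real script may write elsewhere.
config_file="$(mktemp)"

cat > "$config_file" <<'EOF'
-- Static-link settings named in the commit message.
shared: False
executable-static: True

package volca
  -- Assumption: -no-pie is passed through ghc-options.
  ghc-options: -no-pie
EOF

cat "$config_file"
```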

cbits/static-shims.c provides __xmknod / __xmknodat: the unix-2.8.6.0
archive shipped with ghcup's GHC 9.6 was compiled against pre-2.32
glibc headers that inlined mknod()/mknodat() to those internal
symbols, removed from the public ABI in glibc 2.32.

Verified: `file` → "statically linked"; `ldd` → "not a dynamic
executable"; `volca --version` runs in debian:stable-slim and
alpine:latest containers with no system BLAS/LAPACK/Fortran present.

Replace `throwTo mainTid ExitSuccess` in /api/v1/shutdown and the
idle-watchdog with a direct `_exit(0)` via FFI. The throwTo path
exits the program by unwinding to main and letting the RTS run
its full shutdown sequence (pthread_cancel of capability worker
threads, which glibc resolves via `dlopen("libgcc_s.so.1")` for
the unwinder). On a fully-static glibc build that dlopen returns
NULL — the shutdown path then SIGILLs on the missing unwinder
symbol. Reproduces on linux-arm64; linux-amd64 mostly gets away
with it. _exit(0) is async-signal-safe and bypasses the entire
RTS cleanup.

Drop the now-unused `mainTid` parameter from shutdownEndpoint and
idleWatchdog. hFlush stdout/stderr before _exit so buffered log
lines (incl. the "Shutdown requested" / "Idle for Ns" messages)
still reach the user.
@ccomb ccomb force-pushed the fix/static-link-blas-lapack branch from 2394050 to 2f428f6 Compare May 2, 2026 23:24
The build-mode upload glob `…/build/volca/volca*` also matched the
sibling `volca-tmp/` directory (cabal's intermediate build dir),
so PR preview artefacts shipped Main.hi and Main.o alongside the
binary. Spell out the two valid file names (`volca` on POSIX,
`volca.exe` on Windows) instead of trusting the trailing `*`.

Release-mode tarballs are unaffected — they're built from a
staging dir that only contains the binary.
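
The over-matching can be reproduced with a plain shell glob, assuming the upload step's pattern behaves like one in this respect. The directory layout below is a stand-in built in a temporary directory, not the real cabal output tree.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the build output directory: the binary plus cabal's
# intermediate volca-tmp/ directory next to it.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/volca-tmp"
touch "$demo_dir/volca" "$demo_dir/volca-tmp/Main.hi" "$demo_dir/volca-tmp/Main.o"

# The trailing-star pattern matches both the binary and the directory.
glob_matches=("$demo_dir"/volca*)

# Spelling out the valid names keeps only the files that actually exist.
explicit_matches=()
for name in volca volca.exe; do
  if [ -f "$demo_dir/$name" ]; then
    explicit_matches+=("$demo_dir/$name")
  fi
done

printf 'glob:     %s\n' "${glob_matches[@]##*/}"
printf 'explicit: %s\n' "${explicit_matches[@]##*/}"
rm -rf "$demo_dir"
```

The glob list contains `volca` and `volca-tmp`, while the explicit list contains only `volca`, which is the behaviour the fix relies on.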
@ccomb ccomb merged commit 0d7d772 into main May 3, 2026
4 checks passed
@ccomb ccomb deleted the fix/static-link-blas-lapack branch May 3, 2026 15:34
ccomb added a commit that referenced this pull request May 3, 2026
## Summary

- `docker/Dockerfile` failed at the `gen-cabal-config.sh` step with
`cc1: fatal error: /build/volca/cbits/static-shims.c: No such file or
directory`.
- Root cause: PR #20 made `gen-cabal-config.sh` compile
`cbits/static-shims.c` to a `.o` consumed by the static link, but the
Dockerfile only copies `volca.cabal`, `mumps-hs/` and
`gen-cabal-config.sh` into the builder before invoking the script.
- Fix: add `COPY cbits/ /build/volca/cbits/` next to the other dep-spec
copies, so the shim source is present when the script runs and layer
caching only invalidates when shim sources change.

## Test plan

- [x] `./volca/docker-build.sh` (against this branch) progresses past
step 11 — confirmed locally up to `cabal build --only-dependencies
volca`.
- [ ] Full image build completes and `docker run volca --version` works.