Skip to content

fix(sessionArchiveStore): use stream-based copy for virtiofs/NFS compatibility#9

Open
ductiletoaster wants to merge 2 commits into
jmfederico:mainfrom
ductiletoaster:fix/copyfile-virtiofs-eperm
Open

fix(sessionArchiveStore): use stream-based copy for virtiofs/NFS compatibility#9
ductiletoaster wants to merge 2 commits into
jmfederico:mainfrom
ductiletoaster:fix/copyfile-virtiofs-eperm

Conversation

@ductiletoaster

Copy link
Copy Markdown

Summary

fs.copyFile (used in copySessionFileToArchive and as the EXDEV fallback in moveFile) invokes Linux's copy_file_range(2) syscall. That syscall fails with EPERM on some filesystems — notably virtiofs (used by Kata Containers) when the underlying export is NFS, and on certain NFS server configurations directly. Node only falls back to read+write on EOPNOTSUPP / EXDEV, not EPERM, so the error surfaces to the user and archive operations fail with:

Error: EPERM: operation not permitted, copyfile
  '/home/node/.pi/agent/sessions/<project>/<id>.jsonl'
  -> '/home/node/.pi-web/archived-sessions/<id>.jsonl'

Reproducer

On a Kubernetes Kata Containers pod where /home/node is mounted from an NFS-backed PVC via Kata's virtiofs daemon:

const fs = require("node:fs/promises");
await fs.copyFile(srcPath, dstPath);   // throws EPERM

vs. (works on the same paths):

const fs = require("node:fs");
const { pipeline } = require("node:stream/promises");
await pipeline(fs.createReadStream(srcPath), fs.createWriteStream(dstPath));   // OK

Same behaviour reportedly hits on at least some plain NFSv4.1 server configurations where the server returns EPERM for copy_file_range.

Change

Replace copyFile(src, dest) at both call sites in sessionArchiveStore.ts with a small streamCopyFile helper that pipes createReadStream(src) into createWriteStream(dest) via stream/promises.pipeline. This forces the userspace read+write path and works across virtiofs, NFS, and local filesystems alike.

Throughput is slightly lower than the kernel-level copy on fast local disks, but archived session JSONL files are small — the difference is unobservable in practice.

Test plan

  • Existing sessionArchiveStore.test.ts should still pass (no behaviour change for happy-path local copies)
  • Verified on a real Kata + NFS-backed PVC pod: archive flow that previously failed with EPERM now succeeds

Happy to add a unit test that mocks fs.copyFile to throw EPERM and verifies the stream-based helper isn't affected, if you'd like.

…atibility

`fs.copyFile` (used in `copySessionFileToArchive` and as the `EXDEV`
fallback in `moveFile`) invokes Linux's `copy_file_range(2)` syscall.
That syscall fails with `EPERM` on some filesystems — notably virtiofs
(used by Kata Containers) when the underlying export is NFS, and on
certain NFS server configurations directly. Node only falls back to
read+write on `EOPNOTSUPP` / `EXDEV`, not `EPERM`, so the error
surfaces to the caller and archive operations fail with:

  Error: EPERM: operation not permitted, copyfile
    '/home/node/.pi/agent/sessions/<project>/<id>.jsonl'
    -> '/home/node/.pi-web/archived-sessions/<id>.jsonl'

Replace `copyFile(src, dest)` at both call sites with a small
`streamCopyFile` helper that pipes `createReadStream(src)` into
`createWriteStream(dest)` via `stream/promises.pipeline`. This forces
the userspace read+write path and works across virtiofs, NFS, and
local filesystems alike.

Throughput is slightly lower than the kernel-level copy on fast local
disks, but archived session files are small JSONL records — the
difference is unobservable in practice.

Reproduced on a Kubernetes Kata pod where `/home/node` is mounted
from an NFS-backed PVC via Kata's virtiofs daemon; `fs.copyFile`
returns EPERM, but `pipeline(createReadStream(...), createWriteStream(...))`
succeeds.
@jmfederico

Copy link
Copy Markdown
Owner

This approach looks reasonable to me.

I tried a small follow-up on top of this PR that keeps the stream-copy approach but makes it safer: it writes to a temp file in the destination directory, then renames it into place on success. That way, if the stream copy fails, we don’t leave a partial archive file at the final path. The same-directory rename() should be metadata-only and should not hit the copy_file_range(2) path that causes the EPERM.

Could you try this variant in the Kata/virtiofs/NFS environment where you reproduced the issue?
pr-9-atomic-stream-copy

Harmony pins the pi-web image to this fork branch via PI_WEB_FORK_REF
(harmony Dockerfile). With auto: true the theme resolver follows
prefers-color-scheme and renders light when the browser/OS reports
light preference — undesired for the Harmony lab UI where dark is the
intended look across all devices/browsers.

Setting auto: false keeps pi-web-dark active regardless of the system
preference. Users who want auto/light can flip the toggle in the
theme picker; the change only affects the first-paint default.

Fork-only — not intended upstream (upstream default of `auto: true` is
the right call for a public app).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants