Skip to content

security(github-copilot): warn and optionally refuse default token password on shared FS#302

Open
pjdoland wants to merge 2 commits into
plmbr:mainfrom
pjdoland:fix/refuse-default-password-shared-fs
Open

security(github-copilot): warn and optionally refuse default token password on shared FS#302
pjdoland wants to merge 2 commits into
plmbr:mainfrom
pjdoland:fix/refuse-default-password-shared-fs

Conversation

@pjdoland
Copy link
Copy Markdown
Collaborator

@pjdoland pjdoland commented May 18, 2026

Summary

The default NBI_GH_ACCESS_TOKEN_PASSWORD is a public literal. On a shared-home cluster, anyone with read access to another user's ~/.jupyter/nbi/user-data.json can decrypt their Copilot OAuth token using the documented default. admin-guide.md and PRIVACY.md already call this out, but the code did nothing programmatic to prevent it: no startup warning, no refusal-to-write, no FS-mode probe.

Solution

Add two opt-in guardrails on the token write path:

Always-on WARNING when the default password is in use. Fires on the first read or write of the stored token (per process), so an operator sees it in the boot log of any session that touches the file. The message names the env var and the documented escape hatches. When the directory that holds user-data.json is readable or writable by group or other (S_IRGRP/S_IROTH/S_IWGRP/S_IWOTH), the message escalates and names the directory path so a log triage finds the file.

Opt-in refusal. Set NBI_REFUSE_DEFAULT_TOKEN_PASSWORD_ON_SHARED_FS=1 and the write is refused entirely when default password AND group/other-accessible directory both hold. The refusal is opt-in so single-user deployments with incidentally-loose home modes (the default umask 0o022 shape) keep working. Admins who knowingly accept the risk during a rollout, or in single-user contexts where the dir mode is incidental, opt back in per pod with NBI_ALLOW_DEFAULT_TOKEN_PASSWORD=1.

The shared-dir check is POSIX-only; on Windows the gate is a no-op because S_IRGRP/S_IROTH don't map onto NTFS ACL semantics. Env-var parsing reuses the same vocab as _resolve_bool_with_env (true/1/yes/on, case-insensitive) so admins don't have to remember a different set of accepted values for this knob.

The write-bits (S_IWGRP/S_IWOTH) count too: a writable parent dir lets a neighbor plant a forged user-data.json so the next session reads the attacker's token, which is the same shape of damage as a readable parent leaking the existing token. The audit specifies "world-readable" but flagging write is cheap defense in depth.

Testing

tests/test_default_password_refusal.py covers:

  • Warn fires on the read path AND the write path. The read-path test now also asserts the token IS still returned (a regression that aborts the read on warn would silently log the user out).
  • Warn is one-shot per process (subsequent calls don't add log lines).
  • Warn does not fire when the password is overridden.
  • Escalated warn includes the directory path so an operator triaging logs can find the file. The non-shared branch omits the directory path so log analysis can distinguish the two postures by message content alone.
  • Refusal opt-in matrix: default-off (no refuse), opt-in shared (refuse), opt-in private (no refuse), opt-in + opt-out (no refuse), opt-in + non-default password (no refuse).
  • End-to-end: refused write returns False, the file is not created, the error log names the file.
  • End-to-end happy path: write succeeds when refusal not engaged.
  • POSIX-only tests are skipped on Windows.

Full suites: pytest 773 pass, tsc clean, prettier clean, jest clean.

Risks / follow-ups

  • Behavior change: nothing changes by default. Operators must set NBI_REFUSE_DEFAULT_TOKEN_PASSWORD_ON_SHARED_FS=1 to engage the refusal. The always-on WARNING is new noise for any deployment running with the default password, but only one line per process; a deployment that already sets a per-user password sees no new logs.
  • Silent refusal UX: when the gate refuses a write, the user sees the GitHub Copilot login succeed (the in-memory session is fine) but "remember login" silently fails. The server log is the only signal. Tracked as a separate follow-up to surface this in the chat UI via a token_persisted: false field on the login-status event; this PR is opt-in (the admin who sets the env understands the tradeoff) so in-app feedback is a UX enhancement rather than a regression blocker.
  • Default-deny via shared-FS heuristic: an alternative would be to default-deny on detected NFS mounts / JUPYTERHUB_USER presence, so cluster deployments get the protection without an extra env. Out of scope here; the opt-in flag is the conservative first step.
  • Companion file-mode hardening: enforcing 0o600 on the user-data.json file (not just the directory) is tracked as a separate audit item. With that mode, the directory check becomes belt-and-suspenders; without it the dir check is the primary defense.

@pjdoland pjdoland added the bug Something isn't working label May 18, 2026
@pjdoland pjdoland changed the title fix(github-copilot): warn and optionally refuse default token password on shared FS security(github-copilot): warn and optionally refuse default token password on shared FS May 18, 2026
pjdoland added 2 commits May 19, 2026 06:10
…d on shared FS

The default NBI_GH_ACCESS_TOKEN_PASSWORD is a public literal. On a
shared-home cluster (NFS-backed JupyterHub, classroom HPC), anyone with
read access to another user's ~/.jupyter/nbi/user-data.json can decrypt
their Copilot OAuth token. The admin-guide and PRIVACY.md call this out
but the code did nothing programmatic to prevent it: no startup warning,
no refusal-to-write, no FS-mode probe.

Add two opt-in guardrails on the token write path:

1. A per-process WARNING fires on the first read or write of the stored
   token when the default password is in use. The message names the
   env var and the documented escape hatches. When the directory that
   holds user-data.json is readable or writable by group or other
   (S_IRGRP/S_IROTH/S_IWGRP/S_IWOTH), the warning escalates to call out
   the directory path so an operator triaging logs can find the file.

2. When NBI_REFUSE_DEFAULT_TOKEN_PASSWORD_ON_SHARED_FS is set, the
   write is refused entirely under the same conditions: default
   password AND a group/other-accessible directory. The refusal is
   opt-in so single-user deployments with incidentally-loose home
   modes don't break. Admins who knowingly accept the risk during a
   rollout can opt back in per pod with NBI_ALLOW_DEFAULT_TOKEN_PASSWORD.

The shared-dir check is POSIX-only (Windows ACL semantics don't map
onto S_IRGRP/S_IROTH); on Windows the gate is a no-op so existing
deployments are unaffected. Env-var parsing matches the
_resolve_bool_with_env vocab (true/1/yes/on, case-insensitive) so
admins don't have to remember a different set of accepted values for
this knob.

Test coverage in tests/test_default_password_refusal.py covers warn-on
both read and write, warn-once-per-process, the escalated message
including the directory path, the refusal opt-in matrix
(default-off / opt-in shared / opt-in private / opt-in + opt-out /
non-default password), and an end-to-end write that returns False
with a named error message naming the file.

Admin-guide and PRIVACY notes updated to document both new env vars.
Second-pass review surfaced two items:

1. The default-password WARNING mentioned NBI_POD_IDENTITY as a knob
   the operator could set, but that env var ships in PR plmbr#300 (per-pod
   KDF) and does not exist on the branch's merge base. Drop the
   reference until plmbr#300 lands; the remaining knobs
   (NBI_GH_ACCESS_TOKEN_PASSWORD per-user, NBI_ALLOW_DEFAULT_TOKEN_PASSWORD
   opt-out) are sufficient guidance for the deployments this PR
   targets.

2. The read-path warn test only asserted the WARNING fired; the
   contract is "warn AND return the plaintext" because a regression
   that aborts the read on warn would silently log the user out.
   Rewritten to encrypt a real token, run the read, and assert both
   halves.

Also pin that the non-shared branch of the warning omits the
directory path so log analysis can tell the two postures apart by
message content alone.

The silent-fail UX (when NBI_REFUSE_DEFAULT_TOKEN_PASSWORD_ON_SHARED_FS
refuses the write, the user sees no in-app signal) is documented as a
known limitation in the PR description but not addressed here:
plumbing storage-failure status back to the chat sidebar is a
separable change that touches the capability handshake, and the
audit's recommended remediation (operators set the env vars before
rollout) means in-app feedback is a UX enhancement rather than a
regression blocker.
@pjdoland pjdoland force-pushed the fix/refuse-default-password-shared-fs branch from 709847d to 34d1f53 Compare May 19, 2026 10:10
@pjdoland pjdoland added this to the 5.1.x milestone May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants