security(github-copilot): per-pod KDF for stored Copilot token#300
Open
pjdoland wants to merge 1 commit into
Open
security(github-copilot): per-pod KDF for stored Copilot token#300pjdoland wants to merge 1 commit into
pjdoland wants to merge 1 commit into
Conversation
pjdoland
added a commit
to pjdoland/notebook-intelligence
that referenced
this pull request
May 18, 2026
Second-pass review surfaced two items: 1. The default-password WARNING mentioned NBI_POD_IDENTITY as a knob the operator could set, but that env var ships in PR plmbr#300 (per-pod KDF) and does not exist on the branch's merge base. Drop the reference until plmbr#300 lands; the remaining knobs (NBI_GH_ACCESS_TOKEN_PASSWORD per-user, NBI_ALLOW_DEFAULT_TOKEN_PASSWORD opt-out) are sufficient guidance for the deployments this PR targets. 2. The read-path warn test only asserted the WARNING fired; the contract is "warn AND return the plaintext" because a regression that aborts the read on warn would silently log the user out. Rewritten to encrypt a real token, run the read, and assert both halves. Also pin that the non-shared branch of the warning omits the directory path so log analysis can tell the two postures apart by message content alone. The silent-fail UX (when NBI_REFUSE_DEFAULT_TOKEN_PASSWORD_ON_SHARED_FS refuses the write, the user sees no in-app signal) is documented as a known limitation in the PR description but not addressed here: plumbing storage-failure status back to the chat sidebar is a separable change that touches the capability handshake, and the audit's recommended remediation (operators set the env vars before rollout) means in-app feedback is a UX enhancement rather than a regression blocker.
Collaborator
|
@pjdoland this looks great but can we allow to turn off this new logic and use the existing method. In some environments filesystem is not shared and reboots change the machine id which would require a re-login. |
Collaborator
Author
|
Definitely. That is almost certainly the correct approach...as it might be for some of these other security PRS. |
pjdoland
added a commit
to pjdoland/notebook-intelligence
that referenced
this pull request
May 19, 2026
Second-pass review surfaced two items: 1. The default-password WARNING mentioned NBI_POD_IDENTITY as a knob the operator could set, but that env var ships in PR plmbr#300 (per-pod KDF) and does not exist on the branch's merge base. Drop the reference until plmbr#300 lands; the remaining knobs (NBI_GH_ACCESS_TOKEN_PASSWORD per-user, NBI_ALLOW_DEFAULT_TOKEN_PASSWORD opt-out) are sufficient guidance for the deployments this PR targets. 2. The read-path warn test only asserted the WARNING fired; the contract is "warn AND return the plaintext" because a regression that aborts the read on warn would silently log the user out. Rewritten to encrypt a real token, run the read, and assert both halves. Also pin that the non-shared branch of the warning omits the directory path so log analysis can tell the two postures apart by message content alone. The silent-fail UX (when NBI_REFUSE_DEFAULT_TOKEN_PASSWORD_ON_SHARED_FS refuses the write, the user sees no in-app signal) is documented as a known limitation in the PR description but not addressed here: plumbing storage-failure status back to the chat sidebar is a separable change that touches the capability handshake, and the audit's recommended remediation (operators set the env vars before rollout) means in-app feedback is a UX enhancement rather than a regression blocker.
The encrypted token at ~/.jupyter/nbi/user-data.json was keyed via
PBKDF2 from a public default password (NBI_GH_ACCESS_TOKEN_PASSWORD,
default literal "nbi-access-token-password"). Every install with the
default shipped the same key, so an attacker who read user-data.json
off a shared filesystem could decrypt the OAuth access token directly,
no brute force.
Mix a per-pod + per-user secret into the KDF password input before
PBKDF2. Resolution order (first non-empty wins, the rest still
contribute to the concatenation):
1. NBI_POD_IDENTITY env var (highest priority). Admins on K8s where
/etc/machine-id and uid are shared across co-tenants pin this to
a per-pod or per-user value (JupyterHub username, spawn token,
Pod metadata.uid via the downward API).
2. /etc/machine-id or /var/lib/dbus/machine-id when mounted.
3. socket.gethostname() as a last resort (pod name in K8s).
POSIX uid is always included; getattr-guarded for Windows.
The on-disk blob format is unchanged (16 salt bytes + Fernet
ciphertext). New encrypt_user_secret / decrypt_user_secret helpers
wrap the existing crypto with the derived password. Legacy blobs
(those written before this change) decrypt via a fallback to the
bare password and are silently rewritten under the v2 KDF on the
next read, so existing users do not have to re-login.
The rewrite goes through the existing _atomic_write_json helper so a
crash mid-rewrite (K8s OOMKill during the silent upgrade) cannot
leave a torn user-data.json on disk. The legacy-fallback exception
handler catches InvalidToken only, so a corrupted blob or programmer
error surfaces instead of being masked as "needs the legacy path."
Admin-guide and PRIVACY notes updated to call out the K8s
shared-machine-id reality and recommend NBI_POD_IDENTITY for
multi-tenant deployments.
05f10cd to
f78be29
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The encrypted Copilot token at
~/.jupyter/nbi/user-data.jsonwas keyed via PBKDF2 fromNBI_GH_ACCESS_TOKEN_PASSWORDwhose default value is the public literalnbi-access-token-password. Every install with the default ships the same key. An attacker who could read another tenant'suser-data.json(shared NFS home, accidental backup leak, errant git commit) could decrypt the OAuth access token directly: the ciphertext is decryption-equivalent to plaintext for anyone holding the known password.Solution
Mix a per-pod + per-user secret into the KDF password input before PBKDF2. The on-disk blob format is unchanged (16 random salt bytes + Fernet ciphertext); the change is purely in what string feeds
PBKDF2HMAC.derive().Resolution order in
_machine_user_secret()(first non-empty wins, the rest still contribute to the concatenation):NBI_POD_IDENTITYenv var (highest priority). On a multi-tenant Kubernetes deployment,/etc/machine-idtypically resolves to the host node's value (shared across co-tenants) and every container runs as uid1000. The automatic sources collapse to a value shared across co-tenants on the same node. Admins pinNBI_POD_IDENTITYto a value that stays stable across pod restarts (JupyterHub username, per-user spawn secret) for real cross-tenant isolation. Avoidmetadata.uidfrom the downward API: that value rotates on every restart and would invalidate stored tokens on each restart./etc/machine-idor/var/lib/dbus/machine-idwhen mounted.socket.gethostname()as a last resort. In Kubernetes the hostname equals the pod name, which differs across pods and gives per-pod isolation even though it is not a confidential value.POSIX uid is always included. The Windows path is
getattr-guarded so a missingos.getuiddoes not crash the read path.New
encrypt_user_secret/decrypt_user_secrethelpers innotebook_intelligence/util.pywrap the existingencrypt_with_password/decrypt_with_passwordwith the derived password. Thegithub_copilot.pyread/write call sites switch to the new helpers; tests on the bare helpers continue to cover the underlying crypto.Legacy migration: blobs encrypted before this change decrypt via a fallback to the bare password and are silently rewritten under the v2 KDF on the next read. The legacy-fallback exception handler catches
InvalidTokenonly, so a corrupted blob or programmer error surfaces loudly instead of being masked. The rewrite goes through the existing_save_user_datahelper (introduced in PR #293) so the 0o600 file mode contract is preserved across the upgrade and a crash mid-rewrite (a K8s OOMKill during the silent upgrade) cannot leave a tornuser-data.jsonon disk. If the rewrite itself fails (read-only filesystem, transient I/O), the read still returns the decrypted plaintext so the user's login completes; the next read falls back to the legacy path again until the underlying I/O problem clears.Testing
tests/test_user_secret_kdf.py: 13 cases covering round-trip, cross-pod isolation by machine-id and uid, hostname fallback when/etc/machine-idis absent, legacy-blob fallback (and refuses withallow_legacy=False),NBI_POD_IDENTITYoverriding the automatic sources, an end-to-end silent-upgrade test that drivesread_stored_github_access_tokenagainst a fakeuser_data_fileontmp_pathand verifies the on-disk bytes change between calls, and a rewrite-failure test that asserts the read still returns plaintext whenwrite_github_access_tokenraises.Risks / follow-ups
/etc/machine-idmounted, hostname changes,NBI_POD_IDENTITYunset or pointed at a rotating value likemetadata.uid) will fail to decrypt v2 blobs and the user is silently logged out. The admin-guide and PRIVACY notes now document this, recommend stableNBI_POD_IDENTITYvalues, and explicitly warn againstmetadata.uidfrom the downward API._save_user_data(per PR security: enforce 0o600 on the encrypted GitHub token file #293) on every write including the silent legacy-blob upgrade. The merge into this branch adopts that helper at both write sites./etc/machine-idis shared across co-tenants and uid pins to 1000. TheNBI_POD_IDENTITYenv knob lets admins close that gap with a one-line spawn config; the admin guide section calls this out.