Move RELAY_SECRET to endpoint worker_init (no task-arg credentials)#25

Merged
Anas321 merged 2 commits into main from refactor/relay-secret-worker-init
Jun 4, 2026

Conversation

@Anas321 Anas321 commented Jun 4, 2026

Collaborator

Summary

Refactor the HPC streaming control plane so no relay credentials traverse Globus Compute's AMQP channel.

  • remote_vllm_streaming now reads RELAY_SECRET from os.environ on the Lakeshore endpoint, matching the existing RELAY_ENCRYPTION_KEY pattern.
  • gce.submit(...) no longer passes RELAY_SECRET as a task argument.
  • Unused RELAY_SECRET import dropped from globus_compute_client.py.
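
A minimal sketch of the worker-side lookup described in the first bullet. The helper name `read_relay_secret` and its `environ` parameter are hypothetical and not taken from this PR; the actual body of `remote_vllm_streaming` is not shown here. The import sits inside the function on the assumption that the function is serialized and executed on the endpoint.

```python
def read_relay_secret(environ=None):
    """Return RELAY_SECRET from the worker environment, or fail loudly.

    Hypothetical helper: `environ` exists only so the lookup can be
    exercised without touching the real process environment.
    """
    import os  # imported here so the function stays self-contained when shipped to a worker

    env = os.environ if environ is None else environ
    secret = env.get("RELAY_SECRET", "")
    if not secret:
        # An unset or empty value means worker_init was not updated, or the
        # endpoint was not restarted after the update.
        raise RuntimeError(
            "RELAY_SECRET is not set in the worker environment; "
            "add it to worker_init and restart the endpoint"
        )
    return secret
```

Failing with an explicit error, rather than falling back to an empty token, makes a missed deployment step visible in the task result instead of surfacing later as a relay authentication failure.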

Why

Previously the channel-access token (RELAY_SECRET) traveled in Globus Compute task payloads while the encryption key (RELAY_ENCRYPTION_KEY) did not. The asymmetry was a defense-in-depth choice rather than a principled requirement. After this change both credentials live in the endpoint's worker_init environment, so neither is visible in any task record. This eliminates a class of attack in which a leaked task record yields an authenticated relay session.
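
The property "no credential in the task record" can be checked mechanically. The sketch below is an illustration under stated assumptions: `RecordingExecutor` is a hypothetical stand-in for the real executor, `pickle` stands in for whatever serializer the SDK actually uses, and the token value is made up.

```python
import pickle


class RecordingExecutor:
    """Hypothetical stub that records submit() arguments instead of sending them."""

    def __init__(self):
        self.calls = []

    def submit(self, fn, *args, **kwargs):
        # Keep only the arguments; they are what would end up in a task payload.
        self.calls.append((args, kwargs))


def payload_contains(executor, secret):
    """Return True if any recorded call serializes to bytes containing the secret."""
    needle = secret.encode("utf-8")
    return any(needle in pickle.dumps(call) for call in executor.calls)


def fake_task(prompt, *extra):
    """Placeholder task; the real streaming function is not reproduced here."""
    return prompt


SECRET = "example-relay-token"  # made-up value for illustration

before = RecordingExecutor()
before.submit(fake_task, "hello", SECRET)  # old shape: secret passed as a task argument

after = RecordingExecutor()
after.submit(fake_task, "hello")  # new shape: secret is read on the endpoint
```

A check like `payload_contains(after, SECRET)` could serve as a regression guard in a unit test so the secret is not reintroduced as an argument later.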

Deployment requirement

Operators must add export RELAY_SECRET=<token> to the worker_init setting in ~/.globus_compute/<endpoint>/config.yaml (alongside the existing RELAY_ENCRYPTION_KEY export), then restart the endpoint:
```bash
globus-compute-endpoint restart lakeshore-research
```
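
After the restart, one way to confirm the variables reached the worker is a small probe that reports presence only. This is a sketch: `relay_env_status` is a hypothetical name, and running it on the endpoint as a task is an assumption about how an operator might use it.

```python
def relay_env_status(environ=None):
    """Report whether each relay credential is set, without returning its value.

    Hypothetical probe: returning booleans keeps the credentials themselves
    out of any task result.
    """
    import os  # imported here so the function stays self-contained when run remotely

    env = os.environ if environ is None else environ
    required = ("RELAY_SECRET", "RELAY_ENCRYPTION_KEY")
    # An empty string counts as missing, same as an unset variable.
    return {name: bool(env.get(name)) for name in required}
```

Called with no arguments it inspects the current process environment; passing a dict allows the logic to be tested locally.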

The submitter side (STREAM middleware / proxy / hpc-as-api) still reads RELAY_SECRET from its own .env to authenticate its consumer-side connection to the relay. No change there.

Test plan

  • Syntax/import checks pass
  • STREAM relay integration test (tests/test_relay_integration.py) already calls remote_vllm_streaming without relay_secret arg — compatible
  • hpc-as-api full test suite (28 tests) passes after the analogous change in that repo
  • End-to-end HPC token streaming test once Lakeshore endpoint is restarted

Companion change for hpc-as-api: https://github.com/uicacer/hpc-as-api (v0.3.0 release)

🤖 Generated with Claude Code

Refactor remote_vllm_streaming so the Lakeshore worker reads RELAY_SECRET
from os.environ on the endpoint (set in worker_init), the same way it
already reads RELAY_ENCRYPTION_KEY. Drop RELAY_SECRET from the gce.submit()
positional args and from the imports.

After this change, no relay credentials traverse Globus Compute's AMQP
channel — both the channel-access token (RELAY_SECRET) and the AES-256-GCM
payload key (RELAY_ENCRYPTION_KEY) are pre-provisioned on the HPC endpoint.

Deployment requirement: add `export RELAY_SECRET=...` to
~/.globus_compute/<endpoint>/config.yaml worker_init alongside the existing
RELAY_ENCRYPTION_KEY export, then restart the endpoint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pip-audit on the security workflow flagged two CVEs in the transitive
aiohttp dependency. Both are fixed in 3.14.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Anas321 Anas321 merged commit 35101b1 into main Jun 4, 2026
3 checks passed
@Anas321 Anas321 deleted the refactor/relay-secret-worker-init branch June 4, 2026 12:48