fix(agentpool)!: default empty agent keyType to RSA-3072, not Ed25519 (Q109)#227
Merged
Conversation
… (Q109) generateKey treated an empty/unspecified KeyType as Ed25519, a secure-by-default regression: Ed25519 agents cannot decrypt the RSA-OAEP session key the broker sends, so any caller constructing a key with an empty keyType silently dropped the AES-256-CBC defense-in-depth layer. Only the CLI flag default (rsa) held the secure choice. Make the library default match the CLI: an empty or unrecognised keyType now yields RSA-3072. Ed25519 requires the explicit KeyTypeEd25519 opt-in. Adds a unit test pinning the contract and updates the godoc/comments at all construction sites; release-1.0.md bucket G marks the regression resolved.
The AGC integration suite constructed RunnerGroupReconciler with a zero-value AgentKeyType. Before Q109 that meant fast Ed25519 keys; now empty defaults to RSA-3072, whose per-agent generation cost blows the suite's Eventually timeouts on CI (TestAGC_SIGTERM_DeletesAllSessions timed out). These suites exercise the session/reconcile lifecycle, not crypto, so set AgentKeyType explicitly to Ed25519 to keep them fast. The production default stays RSA-3072.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
The library
generateKey(cmd/agc/internal/agentpool/crypto.go) treated an empty/unspecifiedKeyTypeas Ed25519. This PR makes the library default RSA-3072, matching the CLI flag default (--agent-key-type=rsa). Ed25519 now requires the explicitKeyTypeEd25519opt-in.Why
This was a secure-by-default regression of the exact kind
CLAUDE.md's security principles forbid: Ed25519 agents cannot decrypt the RSA-OAEP session key the broker sends, so they lose the AES-256-CBC defense-in-depth layer on job message bodies. Only the CLI flag default held the secure choice — any new constructor or caller that passed an emptykeyType(e.g.NewPool(..., ""), or theRunnerGroupReconciler.AgentKeyTypezero value) silently regressed to the insecure path.RSA-3072 is and stays the default; Ed25519 remains a documented explicit opt-in (Q11, D-5,
docs/plan/security.mdM-11).Changes
crypto.go: empty/unrecognisedKeyType→ RSA-3072; onlyKeyTypeEd25519selects Ed25519. Updated godoc spelling out the regression rationale.crypto_test.go(new): table test pinning the contract — empty,"rsa", and an unrecognised value all yield RSA-3072 (3072-bit verified); explicit"ed25519"still yields Ed25519 (opt-in preserved).pool.go/runnergroup_controller.go: corrected the "empty defaults to Ed25519" comments to RSA.suite_integration_test.go: the AGC integration suite constructed the reconciler with a zero-valueAgentKeyType(previously fast Ed25519). It now opts intoKeyTypeEd25519explicitly — these suites exercise the session/reconcile lifecycle, not crypto, and RSA-3072 generation per agent blew the suite'sEventuallytimeouts on CI (TestAGC_SIGTERM_DeletesAllSessions). The production default stays RSA-3072.docs/plan/release-1.0.md: bucket G marks the Q109 regression resolved.docs/STATUS.md: removed the completed Q109 row (isolated commit).Caller audit: all
generateKey/NewPool/KeyTypesites checked. The unit-test fixtures passKeyTypeEd25519explicitly; the one path that depended on empty→Ed25519 was the integration suite (fixed above). The change only makes default-constructed keys more secure.Testing
make check— green.agentpoolpackage under the-racedetector — green (crypto is part of the multiplexing/crypto core).make -C cmd/agc test-integration— green (includingTestAGC_SIGTERM_DeletesAllSessions).