yuvalk commented Mar 11, 2026
- feat: add PRODUCTION mode with 10 defensive security guards
- feat: add PRODUCTION mode with 10 defensive security guards
- test: add comprehensive production mode tests
- fix: update test_settings match string for K_SERVICE error message
- docs: add production mode to README and test matrix to tests/README
When PRODUCTION=true, the application enforces security guards at startup (fail-fast) and runtime: Vertex AI only, JWT validation, no debug, HTTPS URLs, PostgreSQL, HTTP MCP transport, JWT forwarding to MCP, no CORS middleware, SSO credentials, and DCR configuration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts: # src/lightspeed_agent/config/settings.py
Add a single PRODUCTION=true env var that enforces production-only configuration at startup via Pydantic model validators. All violations are collected and reported together so operators can fix everything at once.
Guards enforced:
1. Force Vertex AI (disallow GOOGLE_API_KEY, require GOOGLE_CLOUD_PROJECT)
2. Force JWT validation (block SKIP_JWT_VALIDATION=true)
3. Disable debug (block DEBUG=true)
4. Force HTTPS on AGENT_PROVIDER_URL and MCP_SERVER_URL
5. Force PostgreSQL (block sqlite DATABASE_URL)
6. Force MCP http transport (block stdio mode)
7. Force JWT forwarding to MCP (block LIGHTSPEED_CLIENT_ID/SECRET)
8. No CORS middleware in both agent and marketplace handler apps
9. Require SSO credentials (RED_HAT_SSO_CLIENT_ID/SECRET)
10. Require DCR config (DCR_ENABLED, DCR_INITIAL_ACCESS_TOKEN, DCR_ENCRYPTION_KEY)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
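The collect-then-report behaviour described in this commit can be sketched roughly as follows. This is an illustration only, not the PR's code: the PR uses Pydantic model validators, whereas the sketch swaps in a plain dataclass with `__post_init__` so it runs without third-party packages. `SettingsSketch`, its field defaults, and the error wording are all hypothetical, and only five of the ten guards are shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SettingsSketch:
    """Hypothetical stand-in for the real settings model."""

    production: bool = False
    debug: bool = False
    skip_jwt_validation: bool = False
    google_api_key: Optional[str] = None
    database_url: str = "sqlite:///./dev.db"
    agent_provider_url: str = "http://localhost:8000"

    def __post_init__(self) -> None:
        if not self.production:
            return  # guards apply only when PRODUCTION=true

        # Collect every violation before failing, so one startup
        # attempt reports the full list instead of the first problem.
        violations: List[str] = []
        if self.google_api_key:
            violations.append("GOOGLE_API_KEY must not be set (Vertex AI only)")
        if self.skip_jwt_validation:
            violations.append("SKIP_JWT_VALIDATION must be false")
        if self.debug:
            violations.append("DEBUG must be false")
        if not self.agent_provider_url.startswith("https://"):
            violations.append("AGENT_PROVIDER_URL must use https")
        if self.database_url.startswith("sqlite"):
            violations.append("DATABASE_URL must not be SQLite")

        if violations:
            raise ValueError(
                "Production mode violations:\n- " + "\n- ".join(violations)
            )
```

Constructing `SettingsSketch(production=True, debug=True)` with the defaults above would raise a single `ValueError` listing three problems (debug, HTTPS, SQLite) rather than stopping at the first.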
30 tests covering all 10 production security guards:
- Happy path with valid production config
- Individual guard violation tests (guards 1-10)
- Multiple simultaneous violations reported together
- Production=false bypasses all guards
- CORS middleware disabled in production (agent + marketplace apps)
- MCP header provider forwards JWT in production mode
Also adds pythonpath=["src"] to pytest config so the worktree's source code is loaded instead of the editable install from the main repo.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The K_SERVICE validator error message was updated during the production mode implementation. Update the existing test to match the new wording.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- README.md: add Production Mode section with guard table, example error output, and startup vs runtime enforcement explanation
- tests/README.md: new file documenting the test suite with a full test matrix for all 30 production mode tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
marking as WIP for now, as I need to go over this with @luis5tb
    # Production Mode
    production: bool = Field(
        default=False,
do we want to have this enabled by default, so that we cannot forget to enable it and you need to do something to disable it?
any comment on this? I would still make this the default
    )

    # Guard 7: Force JWT forwarding to MCP (no service-account creds)
    if self.lightspeed_client_id:
this code is now removed, we can skip self.lightspeed_client_id/secret
        "lightspeed-client-id": settings.lightspeed_client_id,
        "lightspeed-client-secret": settings.lightspeed_client_secret,
    }
    # Skipped in production mode (Guard 7 enforces JWT forwarding)
same here, not needed anymore
Other than the rebase, we may need to update deploy/cloudrun/README so that the production flag is described and enabled (or just add the description and make the flag enabled by default, so that nothing needs to change in the deployment).
…hat_sso_issuer Both URLs are security-sensitive: marketplace_handler_url is used for DCR redirects and red_hat_sso_issuer carries client credentials for token introspection. Also sanitize URLs in error messages to prevent credential leakage.
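One way to keep credentials out of the HTTPS guard's error messages is to rebuild the URL without userinfo, query, or fragment before printing it. The sketch below assumes that approach; `sanitize_url` and `https_violation` are hypothetical helper names, not functions from the PR.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def sanitize_url(url: str) -> str:
    """Return the URL without userinfo, query, or fragment."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore brackets around IPv6 literals
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


def https_violation(name: str, url: str) -> Optional[str]:
    """Return an error message if the URL is not HTTPS, else None."""
    if urlsplit(url).scheme != "https":
        # Only the sanitized form is ever placed in the message.
        return f"{name} must use https, got {sanitize_url(url)}"
    return None
```

For example, `https_violation("RED_HAT_SSO_ISSUER", "http://client:s3cret@sso.example.com/realms/x")` would report the scheme problem while showing only `http://sso.example.com/realms/x`.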
The session database stores ADK sessions and conversation history. If session_database_url is set, it should also be checked for SQLite to match the guard's intent of enforcing PostgreSQL in production.
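A minimal sketch of extending the SQLite check to the session database, assuming the session URL is optional and is checked only when set. `sqlite_violations` and the `SESSION_DATABASE_URL` label are assumptions made for illustration; the source only names the `session_database_url` setting.

```python
from typing import Dict, List, Optional


def sqlite_violations(
    database_url: str, session_database_url: Optional[str]
) -> List[str]:
    """Flag every configured database URL that points at SQLite."""
    candidates: Dict[str, Optional[str]] = {
        "DATABASE_URL": database_url,
        # Assumed env var name for the session_database_url setting.
        "SESSION_DATABASE_URL": session_database_url,
    }
    return [
        f"{name} must not use SQLite in production"
        for name, url in candidates.items()
        if url and url.lower().startswith("sqlite")
    ]
```

An unset session URL produces no violation, so deployments that do not configure it are unaffected.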
When both K_SERVICE and PRODUCTION=true are set, _block_skip_jwt_in_production and Guard 2 would both fire for skip_jwt_validation, producing duplicate errors. Skip the K_SERVICE check when production mode is active since Guard 2 already handles it.
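The de-duplication described here can be pictured as a single function in which the production branch returns before the Cloud Run check is reached. This is a sketch with hypothetical names and messages; in the PR the two checks live in separate validators.

```python
from typing import List, Mapping


def jwt_skip_errors(
    env: Mapping[str, str], production: bool, skip_jwt_validation: bool
) -> List[str]:
    """Report a skipped-JWT misconfiguration at most once."""
    errors: List[str] = []
    if production:
        if skip_jwt_validation:
            errors.append("Guard 2: SKIP_JWT_VALIDATION must be false in production")
        # The K_SERVICE check is skipped: Guard 2 already covers it.
        return errors
    if env.get("K_SERVICE") and skip_jwt_validation:
        errors.append("SKIP_JWT_VALIDATION is not allowed when K_SERVICE is set")
    return errors
```

With both `K_SERVICE` and production mode set, only the Guard 2 message is produced.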
In production mode, sending unauthenticated requests to MCP is both a security concern and an operational problem. Raise RuntimeError instead of returning empty headers so the user gets a clear auth error rather than a confusing MCP 401 response.
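A sketch of the fail-loudly behaviour, assuming the header provider receives the caller's token (or `None`) and a production flag. `mcp_headers`, the `Authorization: Bearer` header, and the error text are illustrative assumptions rather than the PR's actual provider.

```python
from typing import Dict, Optional


def mcp_headers(access_token: Optional[str], production: bool) -> Dict[str, str]:
    """Build headers for an MCP request, forwarding the caller's JWT."""
    if access_token:
        return {"Authorization": f"Bearer {access_token}"}
    if production:
        # Fail here with a clear message instead of letting MCP
        # answer an unauthenticated request with a 401.
        raise RuntimeError("No JWT available to forward to MCP in production mode")
    return {}  # outside production, unauthenticated calls stay possible
```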
    @@ -0,0 +1,64 @@
    # Test Suite
luis5tb left a comment
couple of minor things, but a rebase on main is needed
    )

    @model_validator(mode="after")
    def _block_skip_jwt_in_production(self) -> Self:
Should we rename this method to _block_skip_jwt_on_cloud_run, since with production=True it returns early and leaves the check to _enforce_production_guards?