feat: secure-by-default Cognito authentication for agentic platform API #71
Open
batchus wants to merge 10 commits into
- EnableAuth parameter defaults to true (secure by default); set false only for the open blog/demo walkthrough
- Cognito User Pool + app client + hosted-UI domain (conditional on auth)
- auth.py: verify Cognito JWT in the Lambda (JWKS signature, issuer, audience, expiry, token_use, client_id); fails closed when enabled-but-unconfigured
- Handler enforces the auth gate before any action routing; 401 on failure; internal self-invokes and CORS preflight bypass correctly
- CORS: configurable AllowedOrigin + authorization header
- deploy.sh: ENABLE_AUTH env override (defaults true)
- 25 unittest cases (no pytest dependency) covering auth on/off, token validation failures, fail-closed, and a static check that there are no unauthenticated endpoints / public function URLs / open default

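For reviewers, the claim checks listed for auth.py can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `check_claims` and `AuthError` are hypothetical names, and it assumes the JWKS signature check has already been done (typically with a JWT library) so that `claims` is the decoded payload. It relies on Cognito ID tokens carrying the app client id in `aud` and access tokens carrying it in `client_id`.

```python
import time


class AuthError(Exception):
    """Raised when a token's claims fail validation (hypothetical name)."""


def check_claims(claims, *, issuer, client_id, now=None):
    """Validate decoded Cognito JWT claims; raise AuthError on any failure.

    Assumes the signature was already verified against the pool's JWKS.
    """
    # Fail closed: auth that is enabled but unconfigured must never pass.
    if not issuer or not client_id:
        raise AuthError("auth is enabled but not configured")
    now = time.time() if now is None else now

    if claims.get("iss") != issuer:
        raise AuthError("unexpected issuer")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise AuthError("token expired or missing exp")

    token_use = claims.get("token_use")
    if token_use == "id":
        # ID tokens carry the app client id in `aud`.
        if claims.get("aud") != client_id:
            raise AuthError("unexpected audience")
    elif token_use == "access":
        # Access tokens carry it in `client_id` instead.
        if claims.get("client_id") != client_id:
            raise AuthError("unexpected client_id")
    else:
        raise AuthError("unexpected token_use")
    return claims
```

Passing `now` explicitly keeps the expiry check deterministic in unit tests.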
- auth.js: OAuth2 authorization-code + PKCE login against the Cognito Hosted UI, sessionStorage token handling, authedFetch() that attaches the bearer token and redirects to login on 401/missing token. No-op when VITE_AUTH_ENABLED!=true so the blog/demo build is unchanged.
- Route all ~25 /orchestrate calls through authedFetch across App + components.
- App gates render on auth when enabled (handles ?code= redirect, shows Sign out).

Build-time flags: VITE_AUTH_ENABLED, VITE_COGNITO_DOMAIN, VITE_COGNITO_CLIENT_ID, VITE_AUTH_REDIRECT_URI. Verified both auth-off and auth-on builds compile.

…ECTURE, SECURITY
- README: Authentication section (enable/disable, user creation, UI build flags, Lambda-enforcement design note) + ENABLE_AUTH config row
- ARCHITECTURE: Authentication data flow, auth.py component, Cognito service, updated project structure
- SECURITY: replace the stale 'REST API uses IAM' section with the real agentic platform model (Cognito JWT verified in Lambda, secure by default, fails closed); clarify the container REST API is the separate IAM-auth one; add auth checklist items

Replace SAM HttpApi sugar with raw ApiGatewayV2 Api/Integration/Route/Stage so the Cognito JWT authorizer attaches conditionally (AuthorizationType JWT/NONE). Rejects unauthenticated requests at the gateway edge before the Lambda runs; auth.py still verifies as defense-in-depth and trusts gateway-validated claims.
- Merge feat/metrics-knowledge-items-skill-format (PR #67 review fixes) into the auth branch so it stays a superset
- Apply the same batch:SubmitJob ARN scoping to the AgentCoreExecutionRole for consistency with the AsyncInvokeRole change

Scoping batch:ListJobs to a job-queue ARN silently denied the call (AWS Batch does not support resource-level permissions for ListJobs/DescribeJobs), which made metrics type=jobs return all zeros. Reverted ListJobs + DescribeJobs to "*" with an explanatory comment; SubmitJob stays scoped to the queue/definition ARNs (verified: KI submit still works). Also log (instead of silently swallowing) errors in _get_job_counts so future IAM issues surface.
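The "log instead of silently swallowing" change can be pictured with a sketch like the following. It is not the actual `_get_job_counts`: `get_job_counts` here is a hypothetical stand-in that takes the Batch `list_jobs` call as a parameter so it runs without AWS credentials, and it ignores pagination (`nextToken`) for brevity.

```python
import logging

logger = logging.getLogger(__name__)

# AWS Batch job states that list_jobs can filter on.
STATUSES = (
    "SUBMITTED", "PENDING", "RUNNABLE", "STARTING",
    "RUNNING", "SUCCEEDED", "FAILED",
)


def get_job_counts(list_jobs, job_queue):
    """Count jobs per status; log (rather than hide) any per-status failure.

    `list_jobs` is expected to behave like boto3's Batch client `list_jobs`.
    """
    counts = {status: 0 for status in STATUSES}
    for status in STATUSES:
        try:
            resp = list_jobs(jobQueue=job_queue, jobStatus=status)
            counts[status] = len(resp.get("jobSummaryList", []))
        except Exception:
            # A silent `pass` here is what made an IAM denial look like
            # "all zeros"; logging with the traceback surfaces the cause.
            logger.exception("list_jobs failed for status %s", status)
    return counts
```

In a deployed function this might be called as `get_job_counts(boto3.client("batch").list_jobs, queue_name)`, where `queue_name` is whatever queue the metrics endpoint reports on.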
…ambda-only)
The README/ARCHITECTURE/SECURITY notes and auth.py docstring still described the earlier Lambda-only enforcement approach. Updated to reflect the actual design: primary enforcement is the API Gateway Cognito JWT authorizer (rejects at the edge before the Lambda runs), with in-Lambda verification as defense-in-depth. Raw ApiGatewayV2 resources make the authorizer conditional on EnableAuth.
…ations
Document the option to make the API private for environments with private AWS network access, with the trade-offs: HTTP API v2 has no private-endpoint support (requires migrating to REST API + VPC endpoint), a browser UI can't reach a private endpoint (UI must also go internal), and the WAF association caveat. Recommends public HTTP API + Cognito JWT (+ optional CloudFront-fronted WAF) for the public-UI deployment, private REST API only when the whole path is internal.
Summary
Adds secure-by-default authentication to the agentic ATX platform API.
Backend
- … AWS::ApiGatewayV2 resources so it can be attached conditionally): when EnableAuth=true the /orchestrate route requires a valid Cognito JWT and rejects unauthenticated requests with 401 at the gateway edge.
- … self-signup disabled, admin-create only).
- auth.py verifies the JWT in the Lambda as defense-in-depth (JWKS signature, issuer, audience, expiry, token_use, client_id) and trusts gateway-validated claims when present. Fails closed if enabled-but-misconfigured.
- … authorization header allowed.
- batch:SubmitJob scoped to the queue/definition ARNs (List/Describe stay *; AWS Batch has no resource-level support for those).
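The ordering of the in-Lambda gate described above (preflight bypass, trust gateway-validated claims, otherwise verify the bearer token, 401 on anything else) might look roughly like this sketch. `authorize` and the injected `verify_token` callable are hypothetical names, and the internal self-invoke bypass is left out because its detection logic is not described here. The event shape assumed is the HTTP API payload format 2.0, where a JWT authorizer's claims appear under `requestContext.authorizer.jwt.claims` and header names are lowercased.

```python
import json


def authorize(event, verify_token, auth_enabled=True):
    """Return (claims, None) when allowed, or (None, response) when rejected."""
    ctx = event.get("requestContext", {})

    # Auth disabled (demo build) and CORS preflight requests skip the gate.
    if not auth_enabled or ctx.get("http", {}).get("method") == "OPTIONS":
        return {}, None

    # Claims already validated by the API Gateway JWT authorizer are trusted.
    gateway_claims = ctx.get("authorizer", {}).get("jwt", {}).get("claims")
    if gateway_claims:
        return gateway_claims, None

    # Defense-in-depth: verify the bearer token inside the Lambda.
    header = (event.get("headers") or {}).get("authorization", "")
    scheme, _, token = header.partition(" ")
    if scheme.lower() == "bearer" and token:
        try:
            return verify_token(token), None
        except Exception:
            pass  # fall through to the 401 below; never fail open

    return None, {
        "statusCode": 401,
        "body": json.dumps({"error": "unauthorized"}),
    }
```

Running this before any action routing means a new action cannot be added without passing through the gate.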
Frontend
- auth.js: Cognito Hosted UI login (OAuth2 auth-code + PKCE), token handling, and authedFetch that attaches the bearer token and redirects on 401. No-op unless VITE_AUTH_ENABLED=true, so the open blog/demo build is unchanged.
- /orchestrate calls routed through authedFetch; App gates render on auth with a Sign out control.
Secure by default
EnableAuth defaults to true. Deploy with ENABLE_AUTH=false for the open blog/demo walkthrough.
Tests
33 unittest cases (no pytest dependency): auth on/off, 401 on every HTTP action without
a valid token, gateway-claims trust, fail-closed, and static checks that there are no
unauthenticated endpoints / public function URLs.
Verified on dev account
Docs
README (auth setup, user creation, UI build flags), ARCHITECTURE (auth data flow,
Cognito service), SECURITY (replaces the stale "REST API uses IAM" section with the
real Cognito-JWT model).
Design note
Auth is enforced at the API Gateway JWT authorizer (edge) with in-Lambda verification
as defense-in-depth. Using raw ApiGatewayV2 resources (not SAM's HttpApi
Auth shorthand) is what allows the authorizer to be conditional on
EnableAuth.