23 changes: 13 additions & 10 deletions CLAUDE.md
@@ -122,10 +122,12 @@ terraform/platform/
### Terraform — Org (human-applied, no CI)
```
terraform/org/
-  main.tf         AWS Organizations, SCPs
-  providers.tf    Provider config
-  variables.tf    Variables
-  backend.tf      Separate state key
+  main.tf               AWS Organizations, SCPs
+  identity-center.tf    IAM Identity Center, permission sets, ABAC (team attribute from SAML)
+  cloudtrail.tf         CloudTrail trail + S3 bucket
+  providers.tf          Provider config
+  variables.tf          Variables
+  backend.tf            Separate state key
```

### Terraform — State (bootstrapped)
Expand Down Expand Up @@ -260,13 +262,13 @@ The SA JSON key is at `/javabin/platform/google-admin-sa`, the impersonation tar
| 0a | AWS Discovery | **Done** |
| 0b | Bootstrap State Backend | **Done** — S3 backend live |
| 0c | Organizations + Permission Boundary | **Done** — org enabled, boundary deployed, SCP deferred |
-| 1 | Identity (Google + Identity Center + Cognito) | **Partially done** — GCP SA with domain-wide delegation configured, GitHub App credentials in SSM. Cognito pool Terraform exists in `identity/` and is wired in `main.tf`, but not yet applied with Google IdP config. Identity Center lives in `terraform/org/`. |
+| 1 | Identity (Google + Identity Center + Cognito) | **Deployed** — GCP SA with domain-wide delegation, Identity Center with ABAC + 3 permission sets in `terraform/org/`. Cognito pool TF exists but not yet applied (needs Google OAuth client). |
| 2a | Networking | **Deployed** — VPC, subnets, NAT |
| 2b | Ingress | **Deployed** — ALB + ACM cert |
-| 2c | IAM / OIDC | **Deployed** — 4 CI roles + per-app roles |
+| 2c | IAM / OIDC | **Deployed** — 5 CI roles (infra, per-app, deploy, override-approver, registry) |
| 2d | Compute | **Deployed** — ECS cluster + ECR repos |
| 2e | Monitoring | **Deployed** — GuardDuty, Security Hub, Config, SNS |
-| 2f | Lambda Functions | **Deployed** — 5 working + team-provisioner (stub only) |
+| 2f | Lambda Functions | **Deployed** — 6 working (Google/GitHub/Budget sync live, Cognito/IdC sync not yet implemented in Lambda) |
| 2g | Platform CI | **Done** — plan → LLM review → apply pipeline working |
| 3a | Reusable Terraform Modules | **Code done** — 12 modules in repo |
| 3b | GitHub Actions Workflows | **Code done** — 14 reusable workflows |
@@ -277,9 +279,10 @@ The SA JSON key is at `/javabin/platform/google-admin-sa`, the impersonation tar
| 4 | App Onboarding | **Partially working** — platform-test-app full pipeline passes (plan → review → apply → docker-build), ECS deploy fails on service stabilization |

### Known Issues
-- **ECS deploy stabilization**: platform-test-app task registers but service fails health check — likely networking or port config
-- **Cognito pools not yet applied**: `identity/` has Terraform wired in `main.tf`, but requires `google_client_id`/`google_client_secret`/`certificate_arn` variables
-- **`registered_app_repos` manually managed**: Per-repo IAM roles require entries in this variable. No automated mechanism yet — add repos manually to `registered-apps.auto.tfvars`
+- **ECS deploy stabilization**: platform-test-app task registers but service fails health check
+- **Cognito pools not yet applied**: TF exists but needs Google OAuth client credentials
+- **Team provisioner Lambda**: Google/GitHub/Budget sync working. Cognito and Identity Center sync functions are stubs — need implementation to create groups and assign members
+- **`registered_app_repos` manually managed**: Being replaced with team-scoped IAM roles (repo→team resolved via GitHub API at runtime)

## Agent Guidelines

30 changes: 30 additions & 0 deletions docs/team-lifecycle.md
@@ -0,0 +1,30 @@
# Team Lifecycle — Open Questions

## What works today

- **Member added**: Lambda adds to Google Group + GitHub team
- **Member removed**: Lambda removes from Google Group + GitHub team
- **Team created**: Lambda creates Google Group, GitHub team, AWS Budget
- **Description/budget changed**: Lambda updates in place

## What doesn't work yet

### Team deletion

Deleting a team YAML from the registry does nothing. The Lambda only processes files that exist — a deleted file can't be read. Orphaned resources remain:

- Google Group (`team-{name}@java.no`)
- GitHub team
- AWS Budget
- IAM role (once team-scoped roles are implemented)
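One way to surface these orphans, sketched here under assumptions: compare the team names still present in the registry with the names derived from already-provisioned resources. The function name `find_orphaned_teams`, the `teams/<name>.yaml` layout, and the input shapes are hypothetical and not part of the current Lambda.

```python
from pathlib import PurePosixPath


def find_orphaned_teams(registry_files, provisioned_teams):
    """Return team names that have provisioned resources but no registry YAML.

    registry_files: iterable of paths such as "teams/platform.yaml"
        (hypothetical layout; the real registry may differ).
    provisioned_teams: iterable of team names derived from existing
        resources, e.g. by stripping the "team-" prefix from group names.
    """
    registered = {
        PurePosixPath(path).stem
        for path in registry_files
        if PurePosixPath(path).suffix in {".yaml", ".yml"}
    }
    # Anything provisioned but no longer registered is a candidate orphan.
    return sorted(set(provisioned_teams) - registered)


if __name__ == "__main__":
    print(find_orphaned_teams(["teams/platform.yaml"], ["platform", "web"]))
```

A report like this only lists candidates; whether to delete, archive, or ignore them is exactly the open decision recorded in this document.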

### Repos after team deletion

Repos that were in the deleted GitHub team lose their team association. CI stops working because there's no IAM role to assume. The repos themselves are untouched.

## Decisions needed

1. **Should team deletion be destructive?** Auto-delete Google Group, GitHub team, budget, IAM role? Or archive/disable?
2. **Should deletion be blocked if the team still has repos?** Force teams to remove repos before deleting.
3. **How to detect deletions in CI?** The registry workflow could compare deleted files in the git diff and send a separate "delete" event to the Lambda.
4. **Grace period?** Should there be a cooldown before resources are actually removed?
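Decision 3 could be prototyped roughly as follows. This is a minimal sketch, assuming the registry workflow can capture the output of `git diff --name-status` and that team files live under a `teams/` directory; the `teams/` prefix, the function name, and the event shape (`{"action": "delete", "team": ...}`) are all hypothetical.

```python
def deleted_team_events(name_status_output, teams_dir="teams/"):
    """Build hypothetical "delete" events from `git diff --name-status` output.

    Each output line is "<status>\t<path>"; renames and copies carry a
    third field and are skipped by this sketch.
    """
    events = []
    for line in name_status_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # blank lines, renames (R...), copies (C...)
        status, path = parts
        if status != "D":
            continue  # only plain deletions are of interest here
        if not path.startswith(teams_dir):
            continue
        if not path.endswith((".yaml", ".yml")):
            continue
        team = path[len(teams_dir):].rsplit(".", 1)[0]
        events.append({"action": "delete", "team": team})
    return events


if __name__ == "__main__":
    sample = "M\tteams/platform.yaml\nD\tteams/web.yaml\nD\tREADME.md\n"
    print(deleted_team_events(sample))
```

Renaming a team file also retires the old team name, so a real implementation would need its own rule for rename entries rather than skipping them.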
12 changes: 1 addition & 11 deletions scripts/post-review-comment.sh
@@ -3,26 +3,16 @@
#
# Usage: post-review-comment.sh
#
-# Reads review-result.json and review-output.txt from current directory.
+# Reads review-output.txt from current directory.
# Env: GH_TOKEN (or gh auth), GITHUB_REPOSITORY, PR_NUMBER

set -e

-RISK=$(jq -r '.risk // "FAILED"' review-result.json 2>/dev/null || echo "FAILED")
REVIEW=$(cat review-output.txt 2>/dev/null || echo "LLM review output not available.")

-case "$RISK" in
-  LOW) EMOJI="🟢" ;;
-  MEDIUM) EMOJI="🟡" ;;
-  HIGH) EMOJI="🔴" ;;
-  *) EMOJI="⚪" ;;
-esac
-
cat > /tmp/review-comment.md <<EOF
## LLM Plan Review

-**Risk: ${EMOJI} ${RISK}**
-
${REVIEW}
EOF

9 changes: 7 additions & 2 deletions terraform/lambda-src/team_provisioner/handler.py
@@ -148,8 +148,13 @@ def _get_google_access_token():
         data=data,
         headers={"Content-Type": "application/x-www-form-urlencoded"},
     )
-    with urllib.request.urlopen(req) as resp:
-        token = json.loads(resp.read())["access_token"]
+    try:
+        with urllib.request.urlopen(req) as resp:
+            token = json.loads(resp.read())["access_token"]
+    except urllib.error.HTTPError as e:
+        body_text = e.read().decode("utf-8", errors="replace")
+        logger.error("Google OAuth token exchange failed: %d %s", e.code, body_text)
+        raise
 
     _credential_cache["_google_token"] = {
         "token": token,
4 changes: 2 additions & 2 deletions terraform/platform/lambdas/main.tf
@@ -396,8 +396,8 @@ resource "aws_iam_role_policy" "team_provisioner" {
         Effect = "Allow"
         Action = [
           "budgets:CreateBudget",
-          "budgets:DescribeBudget",
-          "budgets:UpdateBudget",
+          "budgets:ModifyBudget",
+          "budgets:ViewBudget",
         ]
         Resource = "arn:aws:budgets::${var.aws_account_id}:budget/javabin-team-*"
       },
103 changes: 0 additions & 103 deletions terraform/platform/monitoring/main.tf
@@ -1,106 +1,3 @@
-################################################################################
-# CloudTrail — required for EventBridge to receive API call events
-#
-# Without a trail, EventBridge rules matching "AWS API Call via CloudTrail"
-# never fire. This is the single trail (free tier) with management events only.
-################################################################################
-
-resource "aws_s3_bucket" "cloudtrail" {
-  bucket = "${var.project}-cloudtrail-${var.aws_account_id}"
-
-  tags = {
-    Name = "${var.project}-cloudtrail"
-  }
-}
-
-resource "aws_s3_bucket_server_side_encryption_configuration" "cloudtrail" {
-  bucket = aws_s3_bucket.cloudtrail.id
-
-  rule {
-    apply_server_side_encryption_by_default {
-      sse_algorithm = "aws:kms"
-    }
-  }
-}
-
-resource "aws_s3_bucket_public_access_block" "cloudtrail" {
-  bucket = aws_s3_bucket.cloudtrail.id
-
-  block_public_acls = true
-  block_public_policy = true
-  ignore_public_acls = true
-  restrict_public_buckets = true
-}
-
-resource "aws_s3_bucket_lifecycle_configuration" "cloudtrail" {
-  bucket = aws_s3_bucket.cloudtrail.id
-
-  rule {
-    id = "expire-old-logs"
-    status = "Enabled"
-
-    expiration {
-      days = 90
-    }
-  }
-}
-
-resource "aws_s3_bucket_policy" "cloudtrail" {
-  bucket = aws_s3_bucket.cloudtrail.id
-
-  policy = jsonencode({
-    Version = "2012-10-17"
-    Statement = [
-      {
-        Sid = "AWSCloudTrailAclCheck"
-        Effect = "Allow"
-        Principal = { Service = "cloudtrail.amazonaws.com" }
-        Action = "s3:GetBucketAcl"
-        Resource = aws_s3_bucket.cloudtrail.arn
-        Condition = {
-          StringEquals = {
-            "aws:SourceArn" = "arn:aws:cloudtrail:${var.region}:${var.aws_account_id}:trail/${var.project}-trail"
-          }
-        }
-      },
-      {
-        Sid = "AWSCloudTrailWrite"
-        Effect = "Allow"
-        Principal = { Service = "cloudtrail.amazonaws.com" }
-        Action = "s3:PutObject"
-        Resource = "${aws_s3_bucket.cloudtrail.arn}/AWSLogs/${var.aws_account_id}/*"
-        Condition = {
-          StringEquals = {
-            "s3:x-amz-acl" = "bucket-owner-full-control"
-            "aws:SourceArn" = "arn:aws:cloudtrail:${var.region}:${var.aws_account_id}:trail/${var.project}-trail"
-          }
-        }
-      }
-    ]
-  })
-}
-
-resource "aws_cloudtrail" "main" {
-  name = "${var.project}-trail"
-  s3_bucket_name = aws_s3_bucket.cloudtrail.id
-  is_multi_region_trail = true
-  enable_log_file_validation = true
-
-  # Send events to EventBridge (required for our rules to fire)
-  # This is enabled by default for management events when a trail exists,
-  # but being explicit about it
-  event_selector {
-    read_write_type = "All"
-    include_management_events = true
-  }
-
-  depends_on = [aws_s3_bucket_policy.cloudtrail]
-
-  tags = {
-    Name = "${var.project}-trail"
-  }
-}
-
################################################################################
# SNS Topics for Alerts
################################################################################