4 changes: 4 additions & 0 deletions .github/workflows/docker-build.yml
@@ -39,6 +39,9 @@ jobs:
image_uri: ${{ steps.push.outputs.image_uri }}
image_tag: ${{ steps.tags.outputs.primary_tag }}
steps:
- name: Set session name
run: echo "SESSION_NAME=$(echo "${GITHUB_ACTOR}-${GITHUB_SHA:0:8}-${GITHUB_RUN_ID}" | head -c 64)" >> "$GITHUB_ENV"

- uses: actions/checkout@v6

- name: Generate GitHub App token
@@ -62,6 +65,7 @@ jobs:
with:
role-to-assume: arn:aws:iam::${{ inputs.aws_account_id }}:role/javabin-ci-app-broker
aws-region: ${{ inputs.aws_region }}
role-session-name: ${{ env.SESSION_NAME }}

- name: Get deploy credentials from broker
id: broker
4 changes: 4 additions & 0 deletions .github/workflows/ecs-deploy.yml
@@ -33,6 +33,9 @@ jobs:
name: ECS Deploy
runs-on: ubuntu-latest
steps:
- name: Set session name
run: echo "SESSION_NAME=$(echo "${GITHUB_ACTOR}-${GITHUB_SHA:0:8}-${GITHUB_RUN_ID}" | head -c 64)" >> "$GITHUB_ENV"

- name: Generate GitHub App token
id: app-token
uses: actions/create-github-app-token@v2
@@ -55,6 +58,7 @@ jobs:
with:
role-to-assume: arn:aws:iam::${{ inputs.aws_account_id }}:role/javabin-ci-app-broker
aws-region: ${{ inputs.aws_region }}
role-session-name: ${{ env.SESSION_NAME }}

- name: Get deploy credentials from broker
id: broker
15 changes: 12 additions & 3 deletions .github/workflows/platform-ci.yml
@@ -44,6 +44,9 @@ jobs:
plan_sha256: ${{ steps.upload.outputs.plan_sha256 }}
risk_level: ${{ steps.review.outputs.risk_level }}
steps:
- name: Set session name
run: echo "SESSION_NAME=$(echo "${GITHUB_ACTOR}-${GITHUB_SHA:0:8}-${GITHUB_RUN_ID}" | head -c 64)" >> "$GITHUB_ENV"

- uses: actions/checkout@v6
with:
fetch-depth: 0
@@ -71,7 +74,7 @@ jobs:
with:
role-to-assume: arn:aws:iam::${{ env.AWS_ACCOUNT_ID }}:role/javabin-ci-infra-plan
aws-region: ${{ env.AWS_REGION }}
role-session-name: javabin-platform-plan-${{ github.run_id }}
role-session-name: ${{ env.SESSION_NAME }}

- name: Sync registered teams from GitHub org
if: steps.changes.outputs.has_infra_changes == 'true' && github.ref == 'refs/heads/main'
@@ -146,6 +149,9 @@ jobs:
needs.plan.outputs.has_changes == 'true'
environment: production
steps:
- name: Set session name
run: echo "SESSION_NAME=$(echo "${GITHUB_ACTOR}-${GITHUB_SHA:0:8}-${GITHUB_RUN_ID}" | head -c 64)" >> "$GITHUB_ENV"

- uses: actions/checkout@v6

- uses: hashicorp/setup-terraform@v4
@@ -157,7 +163,7 @@ jobs:
with:
role-to-assume: arn:aws:iam::${{ env.AWS_ACCOUNT_ID }}:role/javabin-ci-infra
aws-region: ${{ env.AWS_REGION }}
role-session-name: javabin-apply-${{ github.run_id }}
role-session-name: ${{ env.SESSION_NAME }}

- name: Check risk level
env:
@@ -194,6 +200,9 @@ jobs:
runs-on: ubuntu-latest
if: github.event_name == 'schedule'
steps:
- name: Set session name
run: echo "SESSION_NAME=$(echo "${GITHUB_ACTOR}-${GITHUB_SHA:0:8}-${GITHUB_RUN_ID}" | head -c 64)" >> "$GITHUB_ENV"

- uses: actions/checkout@v6

- uses: hashicorp/setup-terraform@v4
@@ -205,7 +214,7 @@ jobs:
with:
role-to-assume: arn:aws:iam::${{ env.AWS_ACCOUNT_ID }}:role/javabin-ci-infra
aws-region: ${{ env.AWS_REGION }}
role-session-name: javabin-drift-${{ github.run_id }}
role-session-name: ${{ env.SESSION_NAME }}

- name: Terraform Init
working-directory: ${{ env.TF_ROOT }}
4 changes: 4 additions & 0 deletions .github/workflows/tf-plan.yml
@@ -46,6 +46,9 @@ jobs:
env:
PLAN_BUCKET: javabin-ci-plan-artifacts-${{ inputs.aws_account_id }}
steps:
- name: Set session name
run: echo "SESSION_NAME=$(echo "${GITHUB_ACTOR}-${GITHUB_SHA:0:8}-${GITHUB_RUN_ID}" | head -c 64)" >> "$GITHUB_ENV"

- uses: actions/checkout@v6
with:
ref: ${{ github.ref }}
@@ -60,6 +63,7 @@ jobs:
with:
role-to-assume: arn:aws:iam::${{ inputs.aws_account_id }}:role/javabin-ci-app-broker
aws-region: ${{ inputs.aws_region }}
role-session-name: ${{ env.SESSION_NAME }}

- name: Get team credentials from broker
id: broker
9 changes: 7 additions & 2 deletions CLAUDE.md
@@ -116,7 +116,7 @@ terraform/platform/
iam/ GitHub OIDC, CI roles, permission boundary
compute/ ECS cluster, ECR base config
monitoring/ SNS, EventBridge, Config, GuardDuty, Security Hub
lambdas/ slack-alert, cost-report, daily-cost-check, compliance-reporter, override-cleanup, team-provisioner, apply-gate, securityhub-summary, password-set
lambdas/ slack-alert, cost-report, daily-cost-check, compliance-reporter, resource-tagger, budget-enforcer, override-cleanup, team-provisioner, apply-gate, securityhub-summary, password-set, ci-broker
identity/ Cognito user pools (internal + external). Identity Center is in terraform/org/
```

@@ -186,6 +186,9 @@ terraform/state/
| `team-provisioner` | Syncs Google Groups, GitHub teams, AWS Budgets from registry team YAML |
| `securityhub-summary` | Weekly Security Hub findings summary (Monday 08:00 UTC) |
| `password-set` | Self-service password-set for new hero accounts (Function URL) |
| `budget-enforcer` | Scales ECS services to zero when team exceeds 200% budget |
| `resource-tagger` | EventBridge-triggered, auto-tags created-by + commit on new resources |
| `ci-broker` | Validates team membership, vends short-lived team role credentials |

### Scripts
| Script | What |
@@ -228,6 +231,8 @@ Scheduled:
EventBridge (Create/Run) ──► compliance-reporter (report to Slack, no auto-fix)
Hourly ──► override-cleanup (delete stale SSM override tokens)
Registry merge ──► team-provisioner (Google/GitHub/Budget/Cognito/Identity Center sync + hero provisioning)
AWS Budgets (200%) ──► budget-enforcer Lambda ──► ECS scale-to-zero + #javabin-cost-alerts
EventBridge (Create/Run) ──► resource-tagger Lambda ──► Tag created-by + commit
```

## SSM Parameters
@@ -274,7 +279,7 @@ The SA JSON key is at `/javabin/platform/google-admin-sa`, the impersonation tar
| 2c | IAM / OIDC | **Deployed** — 6 CI roles (infra, infra-plan, per-app, deploy, override-approver, registry) |
| 2d | Compute | **Deployed** — ECS cluster + ECR repos |
| 2e | Monitoring | **Deployed** — GuardDuty, Security Hub, Config, SNS |
| 2f | Lambda Functions | **Deployed** — 8 functions (Google/GitHub/Budget/Cognito/Identity Center sync live) |
| 2f | Lambda Functions | **Deployed** — 11 functions (budget-enforcer, resource-tagger, ci-broker added; Google/GitHub/Budget/Cognito/Identity Center sync live) |
| 2g | Platform CI | **Done** — plan → LLM review → apply pipeline working |
| 3a | Reusable Terraform Modules | **Code done** — 12 modules in repo |
| 3b | GitHub Actions Workflows | **Code done** — 14 reusable workflows |
21 changes: 18 additions & 3 deletions docs/app-yaml-reference.md
@@ -116,17 +116,31 @@ resources:

#### databases

DynamoDB tables.
DynamoDB tables (default) or RDS PostgreSQL instances.

```yaml
resources:
databases:
- name: sessions
hash_key: id # required
range_key: timestamp # optional
hash_key: id # required (DynamoDB)
range_key: timestamp # optional (DynamoDB)
env: SESSIONS_TABLE

- name: main
engine: postgres # "dynamodb" (default) or "postgres"/"postgresql"
instance_class: db.t3.micro # RDS only, default: db.t3.micro
allocated_storage: 20 # GB, RDS only, default: 20
engine_version: "16" # PostgreSQL version, RDS only, default: "16"
backup_retention_period: 7 # days, RDS only, default: 7
multi_az: false # RDS only, default: false
deletion_protection: true # RDS only, default: true
env: DATABASE_URL
```

DynamoDB and PostgreSQL entries can coexist in the same `databases` list. Entries without `engine` (or with `engine: dynamodb`) use the DynamoDB module. Entries with `engine: postgres` or `engine: postgresql` use the RDS module.

RDS instances use `manage_master_user_password = true`, which stores the auto-generated master password in Secrets Manager. The ECS task role automatically receives IAM policies for `rds-db:connect` and `secretsmanager:GetSecretValue` on the password secret.
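
The engine routing rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the generator's actual code: the function name `module_for`, the return labels `"dynamodb"` and `"rds"`, and the decision to raise on an unknown engine are all assumptions made for the example.

```python
# Hypothetical sketch of the documented routing rule; names are illustrative.
POSTGRES_ALIASES = {"postgres", "postgresql"}


def module_for(entry: dict) -> str:
    """Return which module a `databases` entry would use under the rule above."""
    # A missing (or empty) engine falls back to DynamoDB.
    engine = (entry.get("engine") or "dynamodb").lower()
    if engine in POSTGRES_ALIASES:
        return "rds"
    if engine == "dynamodb":
        return "dynamodb"
    # Assumption: anything else is rejected rather than silently ignored.
    raise ValueError(f"unsupported engine: {engine!r}")


databases = [
    {"name": "sessions", "hash_key": "id", "env": "SESSIONS_TABLE"},
    {"name": "main", "engine": "postgres", "env": "DATABASE_URL"},
]

# Mixed lists are fine: each entry is routed independently.
routing = {db["name"]: module_for(db) for db in databases}
```

With the two entries from the YAML example, `routing` maps `sessions` to the DynamoDB module and `main` to the RDS module.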

#### secrets

Secrets Manager secrets. Value is set manually after creation.
@@ -354,6 +368,7 @@ Generated files have a `# GENERATED FROM app.yaml` marker. The script only overw
| S3 bucket | `javabin-{bucket_name}-{account_id}` |
| DynamoDB table | `javabin-{table_name}` |
| SQS queue | `javabin-{queue_name}` |
| RDS instance | `{db_name}` (identifier) |
| Secrets Manager | `javabin/{secret_name}` |
| IAM task role | `javabin-{name}` |
| CloudWatch logs | `/ecs/javabin/{name}` |
36 changes: 33 additions & 3 deletions docs/lambda-functions.md
@@ -70,10 +70,40 @@ Queries Security Hub for active findings at HIGH and CRITICAL severity, aggregat

## team-provisioner

**Trigger:** (Future) Registry repo merge events
**Purpose:** Syncs team definitions across Google Workspace, GitHub, Cognito, and IAM.
**Trigger:** Registry repo merge events (via `provision-app.yml` workflow dispatch)
**Purpose:** Syncs team definitions from registry YAML across Google Groups, GitHub teams, AWS Budgets (80% warning + 200% enforcement thresholds), Cognito groups, and Identity Center groups. Also handles hero account provisioning.

**Status:** Stub only — logs event and returns success. Blocked on Google Admin access.
| SSM Parameter | Purpose |
|---------------|---------|
| `/javabin/platform/google-admin-sa` | GCP service account JSON key (domain-wide delegation) |
| `/javabin/platform/google-admin-email` | Admin email for Google Admin SDK impersonation |
| `/javabin/platform/github-app-id` | GitHub App ID for team management |
| `/javabin/platform/github-app-key` | GitHub App private key |
| `/javabin/platform/github-app-client-secret` | GitHub App client secret |

## budget-enforcer

**Trigger:** SNS notification from AWS Budgets (200% threshold)
**Purpose:** Scales a team's ECS services to `desired_count=0` when spending exceeds 200% of their monthly budget. Does NOT destroy resources — services can be scaled back up after resolution.

**Flow:** Parse budget name (`javabin-team-{team}`) → list ECS services tagged with team → scale to zero → post Slack alert.

| SSM Parameter | Channel |
|---------------|---------|
| `/javabin/slack/platform-cost-alerts-webhook` | #javabin-cost-alerts |

**Environment vars:** `ECS_CLUSTER` (default: `javabin-platform`)
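
The first two steps of the flow above (budget name to team, then team to services) might look roughly like the following. This is a sketch under assumptions: the helper names and the shape of the `services` records (a dict with `arn` and `tags`) are invented for illustration, and the real Lambda would obtain services and tags from the ECS API rather than from an in-memory list.

```python
from typing import List, Optional

BUDGET_PREFIX = "javabin-team-"  # from the documented budget name format


def team_from_budget_name(budget_name: str) -> Optional[str]:
    """Extract {team} from `javabin-team-{team}`; None if the name does not match."""
    if not budget_name.startswith(BUDGET_PREFIX):
        return None
    team = budget_name[len(BUDGET_PREFIX):]
    return team or None


def services_to_scale(services: List[dict], team: str) -> List[str]:
    """Pick the ARNs of services whose `team` tag matches (hypothetical record shape)."""
    return [s["arn"] for s in services if s.get("tags", {}).get("team") == team]


# Hypothetical input data for the example.
services = [
    {"arn": "arn:aws:ecs:eu-central-1:111111111111:service/javabin-platform/moresleep",
     "tags": {"team": "pkom"}},
    {"arn": "arn:aws:ecs:eu-central-1:111111111111:service/javabin-platform/cake",
     "tags": {"team": "web"}},
]
team = team_from_budget_name("javabin-team-pkom")
targets = services_to_scale(services, team) if team else []
```

Returning `None` for a budget that does not follow the naming convention lets the caller skip unrelated budgets instead of scaling anything down by mistake.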

## resource-tagger

**Trigger:** EventBridge rule matching all AWS service creation events (`{"prefix": "aws."}` source, `Create*`/`Run*` event names)
**Purpose:** Auto-tags newly created AWS resources with `created-by` (actor) and `commit` (SHA) parsed from the CloudTrail session name. Tags are set via AWS Resource Groups Tagging API, outside Terraform management — no drift or plan noise.

**Session name format:** `{actor}-{sha8}-{run_id}` (enriched in CI workflows)

Idempotent: skips resources that already have a `created-by` tag (preserves original creator).

**Environment vars:** `AWS_ACCOUNT_ID`
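
A minimal sketch of how the `{actor}-{sha8}-{run_id}` session name could be split back into tag values. It assumes the SHA part is 8 lowercase hex characters and the run ID is purely numeric; the function name and the regular expression are illustrative and not taken from the Lambda's source. Because actor names may themselves contain hyphens, the pattern anchors on the trailing two fields rather than splitting on the first hyphen.

```python
import re
from typing import Dict, Optional

# Greedy actor match, then an 8-char hex SHA and a numeric run ID at the end.
SESSION_RE = re.compile(r"^(?P<actor>.+)-(?P<sha>[0-9a-f]{8})-(?P<run_id>\d+)$")


def tags_from_session_name(session_name: str) -> Optional[Dict[str, str]]:
    """Return created-by/commit tags, or None if the name is not in the enriched format."""
    match = SESSION_RE.match(session_name)
    if match is None:
        return None
    return {"created-by": match["actor"], "commit": match["sha"]}


def should_tag(existing_tags: Dict[str, str]) -> bool:
    """Idempotency rule from the text: never overwrite an existing created-by tag."""
    return "created-by" not in existing_tags
```

Session names in the older style, such as `javabin-apply-{run_id}`, do not match the pattern and yield `None`, so a caller can leave those resources untagged.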

## Shared Module: pricing

3 changes: 2 additions & 1 deletion docs/platform-modules.md
@@ -79,10 +79,11 @@ SNS topics, EventBridge rules, Config, GuardDuty, Security Hub.
| GuardDuty | Threat detection |
| Security Hub | Findings aggregation |
| `javabin-alert-dedup` DynamoDB | Deduplication table used by slack-alert Lambda |
| Cost allocation tags | `aws_ce_cost_allocation_tag` resources activating 7 tags: team, service, repo, environment, managed-by, created-by, commit |

## lambdas

8 Lambda functions — see [lambda-functions.md](lambda-functions.md) for details.
11 Lambda functions — see [lambda-functions.md](lambda-functions.md) for details.

## identity

39 changes: 39 additions & 0 deletions docs/reusable-modules.md
@@ -53,6 +53,8 @@ additional_policy_jsons = {
}
```

**`trusted_services`** — controls which AWS service can assume the role. Default: `["ecs-tasks.amazonaws.com"]`. Can be set to `["ec2.amazonaws.com"]` or `["lambda.amazonaws.com"]` via `compute.trusted_service` in app.yaml. Enables cross-compute roles so EC2 instances and Lambda functions get the same auto-wired access policies.
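
For orientation, the trust policy that a `trusted_services` value corresponds to has the standard IAM shape sketched below. The module's Terraform is not shown in this diff, so this is only an illustration of the resulting document; the function name is hypothetical.

```python
import json
from typing import Iterable


def assume_role_policy(trusted_services: Iterable[str] = ("ecs-tasks.amazonaws.com",)) -> dict:
    """Build a standard IAM trust policy allowing the given service principals."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": sorted(trusted_services)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Default (ECS tasks) versus a Lambda-assumable role.
ecs_policy = assume_role_policy()
lambda_policy_json = json.dumps(assume_role_policy(["lambda.amazonaws.com"]), indent=2)
```

Only the `Principal.Service` list changes between compute types, which is why the same access policies can be reused across ECS, EC2, and Lambda roles.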

**Outputs:** `role_arn`, `role_name`, `role_id`

## ecs-service
@@ -61,6 +63,8 @@ ECS Fargate task definition + service + CloudWatch log group.

Supports `environment` (map) and `secrets` (map of name => ARN) for container configuration.

**Tag propagation:** `enable_ecs_managed_tags = true` and `propagate_tags = "SERVICE"` ensure Fargate task-level compute costs are attributed to the team via Cost Explorer.

**Outputs:** `service_name`, `task_definition_arn`, `log_group_name`

## service-bucket
@@ -91,6 +95,41 @@ SQS queue + dead-letter queue with configurable retention and visibility timeout
**Naming:** `{project}-{name}` (queue), `{project}-{name}-dlq` (DLQ)
**Outputs:** `queue_url`, `queue_arn`, `dlq_url`, `dlq_arn`, `access_policy_json`

## service-rds

RDS PostgreSQL instance in private subnets.

**Inputs:**

| Input | Default |
|-------|---------|
| `name` | required |
| `engine_version` | `"16"` |
| `instance_class` | `db.t3.micro` |
| `allocated_storage` | 20 GB |
| `subnet_ids` | required |
| `vpc_id` | required |
| `allowed_security_group_ids` | required |
| `backup_retention_period` | 7 |
| `multi_az` | false |
| `deletion_protection` | true |

**Password:** Managed by AWS via `manage_master_user_password = true` (Secrets Manager).

**Outputs:** `endpoint`, `port`, `db_name`, `access_policy_json`, `security_group_id`

**Auto-wiring:** `access_policy_json` grants `rds-db:connect` + `secretsmanager:GetSecretValue`. Auto-attached to task role via `collect:access_policy_json`.

**app.yaml:**
```yaml
databases:
  - name: main
    engine: postgres
    instance_class: db.t3.micro
    allocated_storage: 20
    engine_version: "16"
```

## service-alarm

CloudWatch alarms for ECS services: CPU high, memory high, unhealthy targets, 5xx errors.
8 changes: 5 additions & 3 deletions scripts/ensure-tf-boilerplate.sh
@@ -58,9 +58,11 @@ provider "aws" {

default_tags {
tags = {
project = "${REPO_NAME}"
team = "${TEAM}"
managed-by = "terraform"
team = "${TEAM}"
service = "${REPO_NAME}"
repo = "${GITHUB_REPOSITORY}"
environment = "production"
managed-by = "terraform"
}
}
}
31 changes: 28 additions & 3 deletions scripts/expand-modules.py
@@ -34,6 +34,25 @@

GENERATED_MARKER = "# GENERATED FROM app.yaml — do not edit, changes will be overwritten"

# Engine aliases — these all resolve to "postgres" for registry matching
_POSTGRES_ENGINES = {"postgres", "postgresql"}


def _item_matches_engine_filter(entry, item):
    """Check if a YAML list item matches the entry's engine_filter.

    If the entry has no engine_filter, the item always matches.
    Items without an explicit 'engine' field default to 'dynamodb'
    for backward compatibility with existing DynamoDB entries.
    """
    engine_filter = entry.get("engine_filter")
    if engine_filter is None:
        return True
    item_engine = (item.get("engine") or "dynamodb").lower()
    if item_engine in _POSTGRES_ENGINES:
        item_engine = "postgres"
    return item_engine == engine_filter


# ---------------------------------------------------------------------------
# YAML helpers
@@ -592,8 +611,11 @@ def main():
instance_key = entry.get("instance_key", "name")
for item in items:
inst_name = item.get(instance_key, "")
if inst_name:
collection_instances.append((entry["id"], inst_name, entry))
if not inst_name:
continue
if not _item_matches_engine_filter(entry, item):
continue
collection_instances.append((entry["id"], inst_name, entry))

# -- Pre-compute collect expressions --
access_policies = collect_access_policies(collection_instances)
@@ -649,6 +671,8 @@ def main():
inst_name = item.get(instance_key, "")
if not inst_name:
continue
if not _item_matches_engine_filter(entry, item):
continue
hcl = _expand_collection_item(
entry, source, yaml_data, env_vars, ref_resolver, item, inst_name,
mod_vars,
@@ -670,8 +694,9 @@ def main():
write_file(
os.path.join(tf_root, "providers.tf"),
PROVIDERS_TEMPLATE.format(
region=region, project=PROJECT,
region=region,
service=service, team=app_team,
repo=os.environ.get("GITHUB_REPOSITORY", f"javaBin/{service}"),
),
)
write_file(