39 changes: 33 additions & 6 deletions platform-overview.md
@@ -33,15 +33,15 @@ Seven Terraform sub-modules manage shared resources:
|--------|-----------|
| **networking** | VPC, public/private subnets across 3 AZs, NAT gateway, security groups |
| **ingress** | ALB, ACM wildcard certificate for `*.javazone.no`, Route53 DNS |
| **iam** | GitHub OIDC provider, per-app CI roles, permission boundary |
| **iam** | GitHub OIDC provider, per-team CI roles (ABAC), permission boundary |
| **compute** | ECS Fargate cluster (`javabin-platform`), ECR base config |
| **monitoring** | SNS topics, EventBridge rules, AWS Config, GuardDuty, Security Hub |
| **lambdas** | 6 Lambda functions for alerts, cost reporting, compliance, cleanup |
| **lambdas** | 11 Lambda functions for alerts, cost reporting, compliance, budget enforcement, resource tagging, team provisioning |
| **identity** | Cognito user pools (internal + external). Internal pool connected to Google IdP. Identity Center is in `terraform/org/` (deployed) |

### Reusable Modules (`terraform/modules/`)

Twelve Terraform modules that app repos source via `git::` URLs. The key one is `app-stack`, the golden path module that reads `app.yaml` and creates all infra for a service (ECR, ECS service, ALB routing, IAM role, optional S3/DynamoDB/SQS/Secrets Manager).
Twelve Terraform modules that app repos source via `git::` URLs. CI generates expanded Terraform from `app.yaml` using `expand-modules.py` + `registry.py`. Supported resources: ECR, ECS service, ALB routing, IAM role (configurable for ECS/EC2/Lambda), S3, DynamoDB, RDS PostgreSQL, SQS, Secrets Manager. Cross-service access is auto-wired via `access_policy_json` outputs.

### Reusable Workflows (`.github/workflows/`)

@@ -60,11 +60,16 @@ App repos call `javaBin/platform/.github/workflows/javabin.yml` as their CI entr
| Function | Trigger | Purpose |
|----------|---------|---------|
| `slack-alert` | SNS subscription | Routes security/cost events to Slack with LLM analysis |
| `cost-report` | Weekly schedule (Mon 08:00 UTC) | Cost breakdown with LLM narrative |
| `daily-cost-check` | Daily schedule (08:00 UTC) | Spike detection, silent if no anomalies |
| `cost-report` | Weekly schedule (Mon 08:00 UTC) | Cost breakdown with LLM narrative, per-team attribution |
| `daily-cost-check` | Daily schedule (08:00 UTC) | Spike detection with team breakdown, silent if no anomalies |
| `compliance-reporter` | EventBridge (resource create/run) | Reports untagged resources to Slack |
| `resource-tagger` | EventBridge (all AWS create/run) | Auto-tags created-by + commit from CI session names |
| `budget-enforcer` | SNS (AWS Budgets 200%) | Scales team's ECS services to zero, posts Slack alert |
| `override-cleanup` | Hourly schedule | Deletes stale SSM override tokens |
| `team-provisioner` | Registry merge | Syncs Google Groups, GitHub teams, AWS Budgets, Cognito, Identity Center. Also handles hero provisioning (Workspace accounts, aliases, group membership) |
| `team-provisioner` | Registry merge | Syncs Google Groups, GitHub teams, AWS Budgets, Cognito, Identity Center, hero provisioning |
| `apply-gate` | CI invocation | Credential broker for gated Terraform apply with risk verification |
| `securityhub-summary` | Weekly schedule (Mon 08:00 UTC) | HIGH/CRITICAL Security Hub findings summary |
| `password-set` | Function URL | Self-service password set for new hero accounts |

## How Apps Get CI/CD

@@ -91,6 +96,28 @@ The [registry](https://github.com/javaBin/registry) serves two purposes:

Changes to `groups/` trigger provisioning: Google Workspace account creation, group membership sync, email aliases, and Cognito/Identity Center sync where configured. Heroes are synced from a yearly Google Sheets application process.

## Tag Schema

Every AWS resource gets 7 tags — 5 static (Terraform-managed) and 2 dynamic (auto-applied by the resource-tagger Lambda):

| Tag | Source | Example | Purpose |
|-----|--------|---------|---------|
| `team` | app.yaml / default_tags | `web-team` | ABAC, cost attribution, budgets |
| `service` | app.yaml / default_tags | `moresleep` | Cost breakdown within team |
| `repo` | app.yaml / default_tags | `javaBin/moresleep` | Link resource to source code |
| `environment` | default_tags | `production` | Multi-env support |
| `managed-by` | default_tags | `terraform` | Distinguish TF vs console |
| `created-by` | resource-tagger Lambda | `alice` | Who created (set once) |
| `commit` | resource-tagger Lambda | `abc12345` | Which commit (set once) |

Cost allocation tags are activated in AWS, so Cost Explorer can group by `team` and `service`.
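
As a rough illustration of the schema above, the following Python sketch checks a resource's tags against the seven keys in the table. The function name `missing_tags` and the two constants are hypothetical, chosen for illustration; this is not the code of the `compliance-reporter` or `resource-tagger` Lambdas, and treating an empty value as missing is an assumption.

```python
# Hypothetical sketch: tag keys are taken from the table above,
# everything else (names, empty-value rule) is an assumption.
STATIC_TAGS = ("team", "service", "repo", "environment", "managed-by")
DYNAMIC_TAGS = ("created-by", "commit")


def missing_tags(tags: dict[str, str]) -> list[str]:
    """Return the schema keys that are absent or empty, in schema order."""
    return [key for key in STATIC_TAGS + DYNAMIC_TAGS if not tags.get(key)]


# Example values copied from the "Example" column of the table.
example = {
    "team": "web-team",
    "service": "moresleep",
    "repo": "javaBin/moresleep",
    "environment": "production",
    "managed-by": "terraform",
    "created-by": "alice",
    "commit": "abc12345",
}

if __name__ == "__main__":
    print(missing_tags(example))
    print(missing_tags({"team": "web-team"}))
```

A resource created outside CI would most likely show up here with the two dynamic keys missing, since those are only applied from CI session names.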

## Budget Enforcement

Teams get a monthly budget (default 500 NOK). Two thresholds:
- **80%** — SNS alert to #javabin-cost-alerts
- **200%** — `budget-enforcer` Lambda scales the team's ECS services to `desired_count=0` (not destroyed, easy recovery)
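
The two thresholds can be sketched as a small decision function. This is only an illustration of the rule described above: in the platform the thresholds are evaluated by AWS Budgets and delivered over SNS, not by code like this. The names `Action` and `budget_action` are hypothetical, and treating "exactly at the threshold" as crossing it is an assumption.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    ALERT = "alert"                  # SNS alert to the cost-alerts channel
    SCALE_TO_ZERO = "scale_to_zero"  # ECS services set to desired_count=0


DEFAULT_BUDGET_NOK = 500.0  # default monthly budget stated above
ALERT_RATIO = 0.80          # 80% threshold
ENFORCE_RATIO = 2.00        # 200% threshold


def budget_action(spend_nok: float, budget_nok: float = DEFAULT_BUDGET_NOK) -> Action:
    """Map a team's month-to-date spend to the action the thresholds imply."""
    if budget_nok <= 0:
        raise ValueError("budget must be positive")
    ratio = spend_nok / budget_nok
    if ratio >= ENFORCE_RATIO:
        return Action.SCALE_TO_ZERO
    if ratio >= ALERT_RATIO:
        return Action.ALERT
    return Action.NONE


if __name__ == "__main__":
    for spend in (100.0, 450.0, 1200.0):
        print(spend, budget_action(spend).value)
```

With the default budget, spend of 400 NOK would trigger the alert and spend of 1000 NOK would trigger the scale-down. Services are scaled to zero rather than destroyed, so recovery amounts to restoring the desired count.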

## AWS Account

- **Account**: (private — see platform repo)