feat(gcp): refactor GCP modules into composer + submodules (full) #233

Draft
micheledaddetta-databricks wants to merge 19 commits into main from feature/gcp-modules-refactor

micheledaddetta-databricks (Collaborator) commented on May 14, 2026

Summary

Full GCP modules refactor described in docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md.

The repo previously shipped three duplicated GCP workspace modules. Each example wrapped its own dedicated module. A change to a shared piece — e.g. a new GCP region added to the regional PSC service-attachment map, or a workspace argument added by the provider — needed to land in 2–3 places.

This PR replaces that structure with a unified composer at modules/gcp/databricks-workspace that takes scenario flags as inputs and conditionally instantiates focused submodules:

  • modules/gcp/network — VPC + subnet + router + NAT + peering + shared-VPC (create or existing)
  • modules/gcp/private-connectivity — PSC subnet + endpoints + egress firewall stack
  • modules/gcp/account — all databricks_mws_* resources (workspaces, networks, vpc_endpoint, private_access_settings)
  • modules/gcp/dns — private DNS zones for hub + spoke (gcp.databricks.com, gcr.io, googleapis.com, pkg.dev)

Existing modules/gcp-sa-provisioning and modules/gcp-unity-catalog are relocated under modules/gcp/ (no functional changes).

Architecture

The dependency graph is linear: network → private-connectivity → account → dns. All databricks_mws_* resources are colocated in account; databricks_mws_vpc_endpoint references the PSC forwarding rules from private-connectivity; dns consumes account.workspace_url.

Composer enforces 6 cross-variable preconditions via null_resource.preconditions:

| Rule | Reason |
| --- | --- |
| restricted_egress ⇒ vpc_source="create" | Hub-spoke + egress firewall + private DNS require us to own both VPCs |
| restricted_egress ⇒ at least one PrivateLink flag | Egress-restricted workspace without PSC is unreachable |
| restricted_egress ⇒ hub_vpc_google_project + hub_vpc_cidr + psc_subnet_cidr set | Hub topology needs these |
| vpc_source="create" ⇒ spoke_vpc_cidr + subnet_cidr set | Need CIDRs |
| vpc_source="existing" ⇒ existing_vpc_name + existing_subnet_name set | Need names to look up |
| vpc_source="databricks_managed" ⇒ all PrivateLink/egress flags false | Can't attach PSC to a VPC we don't own |
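
As a rough illustration of how rules like these can be expressed, the sketch below attaches the first two to a `null_resource` through `lifecycle` `precondition` blocks (available since Terraform 1.2). The variable names come from the rules listed here; the types, defaults, error messages, and the assumption that "PrivateLink flag" means `private_link_frontend` or `private_link_backend` are illustrative, not the composer's actual code.

```hcl
# Hypothetical sketch: types, defaults and messages are assumptions.
variable "vpc_source" {
  type    = string
  default = "databricks_managed"
}

variable "restricted_egress" {
  type    = bool
  default = false
}

variable "private_link_frontend" {
  type    = bool
  default = false
}

variable "private_link_backend" {
  type    = bool
  default = false
}

resource "null_resource" "preconditions" {
  lifecycle {
    # restricted_egress => vpc_source = "create"
    precondition {
      condition     = !var.restricted_egress || var.vpc_source == "create"
      error_message = "restricted_egress requires vpc_source = \"create\"."
    }

    # restricted_egress => at least one PrivateLink flag
    precondition {
      condition     = !var.restricted_egress || var.private_link_frontend || var.private_link_backend
      error_message = "restricted_egress requires private_link_frontend or private_link_backend."
    }
  }
}
```

Each implication `A ⇒ B` is written as `!A || B`, so a rule only fires when its left-hand side is actually set.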

Example migrations

| Example | Before | After |
| --- | --- | --- |
| examples/gcp-basic | modules/gcp-workspace-basic | composer with vpc_source="databricks_managed" |
| examples/gcp-byovpc | modules/gcp-workspace-byovpc | composer with vpc_source="create" + CIDRs |
| examples/gcp-with-psc-exfiltration-protection | modules/gcp-with-psc-exfiltration-protection + modules/gcp-unity-catalog | composer with vpc_source="create" + 4 connectivity flags + relocated UC |
| examples/gcp-existing-vpc | (NEW) | composer with vpc_source="existing" |
| examples/gcp-sa-provisioning | github URL source | ../../modules/gcp/service-account |
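
For orientation, a migrated gcp-byovpc caller might look roughly like the sketch below. Only `vpc_source`, `spoke_vpc_cidr`, `subnet_cidr` and `google_region` are input names mentioned in this PR; the relative source path, the other arguments and all values are assumptions.

```hcl
# Hypothetical caller for examples/gcp-byovpc; values are placeholders.
module "databricks_workspace" {
  source = "../../modules/gcp/databricks-workspace"

  vpc_source     = "create"
  spoke_vpc_cidr = "10.0.0.0/16"
  subnet_cidr    = "10.0.0.0/22" # must fall inside spoke_vpc_cidr

  google_region = var.google_region

  # Assumed input names, not confirmed by the PR description.
  google_project        = var.google_project
  databricks_account_id = var.databricks_account_id
}
```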

State from old applies does NOT migrate cleanly to the new composer because resource addresses differ. Each example's README documents this. Re-apply on clean state.

Commits

  1. docs: add design spec for GCP modules refactor
  2. docs: add implementation plan for GCP modules refactor
  3. build: add Makefile recursion for modules/gcp/ submodules
  4. feat(gcp/network): VPC create/existing/hub + fixtures
  5. feat(gcp/private-connectivity): PSC + firewall + fixtures
  6. feat(gcp/account): mws_* resources + fixtures
  7. feat(gcp/dns): hub + spoke private zones + fixture
  8. feat(gcp/databricks-workspace): composer + preconditions + fixtures
  9. refactor(gcp/service-account): relocate from gcp-sa-provisioning
  10. refactor(gcp/unity-catalog): relocate from gcp-unity-catalog
  11. docs(gcp): terraform-docs READMEs for new submodules
  12. refactor(examples/gcp): migrate to new composer + add existing-vpc
  13. chore(gcp): remove deprecated modules and junk directories
  14. docs: refresh top-level README for new GCP layout

Test plan

  • terraform validate passes for every submodule and the composer
  • 4 positive fixtures under modules/gcp/databricks-workspace/tests/ validate (basic, byovpc, existing-vpc, psc-isolated)
  • 4 negative fixtures fail terraform plan with the expected precondition error
  • Per-submodule fixtures (tests/<scenario>/) under each submodule plan with expected resource counts
  • make -C modules/gcp docs regenerated all module READMEs; no drift outside modules/gcp/
  • Every migrated example passes terraform validate
  • Sandbox terraform apply against the migrated PSC example (medium-risk, manual verification before merge)
  • Sandbox apply against gcp-basic and gcp-byovpc (smoke)

Spec + plan

  • Design spec: docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md
  • Implementation plan: docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md

Captures the brainstormed design for refactoring GCP-related Terraform
examples and modules: a single composer module under modules/gcp/ that
conditionally instantiates submodules (network, private-connectivity,
account, dns), with each example becoming a thin caller that varies only
scenario inputs.

Co-authored-by: Isaac
31 tasks across 6 PRs implementing the design spec. Each task has
explicit file paths, code blocks, validation commands, and commit
messages. Self-review checks complete.

Co-authored-by: Isaac
Adds modules/gcp/Makefile mirroring the modules/ pattern (discover
sub-projects via */README.md) and updates modules/Makefile to recurse
into the gcp/ subdir for terraform-docs generation.

Co-authored-by: Isaac
Adds modules/gcp/network module supporting three VPC provenance modes:
- vpc_source="create": Terraform creates spoke VPC + subnet + Cloud Router + NAT
- vpc_source="existing": data-source lookup for pre-existing VPC + subnet
- create_hub=true: adds hub VPC + subnet + bidirectional peering + optional
  shared-VPC host/service binding

Outputs cover all spoke and hub identifiers. Test fixtures in tests/
cover create, existing, and create-with-hub scenarios.
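
The create/existing split described above is commonly implemented with `count` on a resource and on a matching data source. A minimal sketch, assuming hypothetical resource labels, a placeholder VPC name and a `spoke_vpc_id` local that are not taken from the module itself:

```hcl
# Hypothetical sketch; labels, names and the local are assumptions.
variable "vpc_source" {
  type = string # "create" or "existing"
}

variable "google_project" {
  type = string
}

variable "existing_vpc_name" {
  type    = string
  default = null
}

# Created only when Terraform owns the spoke VPC.
resource "google_compute_network" "spoke" {
  count                   = var.vpc_source == "create" ? 1 : 0
  project                 = var.google_project
  name                    = "spoke-vpc"
  auto_create_subnetworks = false
}

# Looked up only when the caller brings an existing VPC.
data "google_compute_network" "spoke" {
  count   = var.vpc_source == "existing" ? 1 : 0
  project = var.google_project
  name    = var.existing_vpc_name
}

locals {
  # One identifier for downstream consumers, regardless of provenance.
  spoke_vpc_id = var.vpc_source == "create" ? google_compute_network.spoke[0].id : data.google_compute_network.spoke[0].id
}
```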

Co-authored-by: Isaac
Adds modules/gcp/private-connectivity with:
- Dedicated PSC subnet in spoke VPC
- Backend (SCC) PSC endpoint gated on enable_backend
- Frontend PSC endpoint (spoke) gated on enable_frontend
- Hub-side frontend PSC endpoint when hub present + frontend enabled
- Regional PSC service-attachment maps for 14 GCP regions
- Egress firewall stack gated on restrict_egress: deny-egress (priority
  1100), allow Google APIs, allow Databricks control plane (to PSC IPs),
  optional allow managed Hive, hub ingress from spoke CIDR

Test fixtures: full-isolated and no-egress.
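
To make the firewall layering concrete, here is a hedged sketch of a catch-all deny rule paired with one allow rule. The deny priority of 1100 and the `restrict_egress` gate come from the commit message; the rule names, the allow priority, the other variable names and the choice of the restricted Google APIs range are assumptions.

```hcl
# Hypothetical sketch; names, allow priority and ranges are assumptions.
variable "restrict_egress" {
  type    = bool
  default = false
}

variable "google_project" {
  type = string
}

variable "spoke_vpc_self_link" {
  type = string
}

# Catch-all deny for outbound traffic.
resource "google_compute_firewall" "deny_egress" {
  count              = var.restrict_egress ? 1 : 0
  project            = var.google_project
  name               = "deny-egress"
  network            = var.spoke_vpc_self_link
  direction          = "EGRESS"
  priority           = 1100
  destination_ranges = ["0.0.0.0/0"]

  deny {
    protocol = "all"
  }
}

# Allow rule for restricted.googleapis.com. GCP evaluates lower numbers
# first, so it needs a priority below 1100 to win over the deny rule.
resource "google_compute_firewall" "allow_google_apis" {
  count              = var.restrict_egress ? 1 : 0
  project            = var.google_project
  name               = "allow-google-apis"
  network            = var.spoke_vpc_self_link
  direction          = "EGRESS"
  priority           = 1000
  destination_ranges = ["199.36.153.4/30"]

  allow {
    protocol = "tcp"
    ports    = ["443"]
  }
}
```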

Co-authored-by: Isaac
Adds modules/gcp/account housing all databricks_mws_* resources:
- databricks_mws_workspaces (always emitted)
- databricks_mws_networks (when vpc_source != databricks_managed),
  with dynamic vpc_endpoints block populated when both PSC endpoint
  IDs are present
- databricks_mws_vpc_endpoint frontend/backend/transit, each gated
  on its enable_* flag plus presence of the matching forwarding-rule
  name from private-connectivity
- databricks_mws_private_access_settings (when private_access_only)

Test fixtures: databricks-managed, byovpc, psc-with-pas.
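
The conditional `vpc_endpoints` block mentioned above could be written along these lines. This is a sketch under assumptions: the variable names, the network name and the provider version constraint are hypothetical, and the exact set of `gcp_network_info` arguments depends on the Databricks provider version in use.

```hcl
# Hypothetical sketch; variable names and constraints are assumptions.
terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = ">= 1.0"
    }
  }
}

variable "databricks_account_id" {
  type = string
}

variable "google_project" {
  type = string
}

variable "google_region" {
  type = string
}

variable "vpc_name" {
  type = string
}

variable "subnet_name" {
  type = string
}

variable "frontend_endpoint_id" {
  type    = string
  default = null
}

variable "backend_endpoint_id" {
  type    = string
  default = null
}

resource "databricks_mws_networks" "this" {
  account_id   = var.databricks_account_id
  network_name = "workspace-network"

  gcp_network_info {
    network_project_id = var.google_project
    vpc_id             = var.vpc_name
    subnet_id          = var.subnet_name
    subnet_region      = var.google_region
  }

  # Emitted only when both PSC endpoint IDs are known.
  dynamic "vpc_endpoints" {
    for_each = var.frontend_endpoint_id != null && var.backend_endpoint_id != null ? [1] : []
    content {
      rest_api        = [var.frontend_endpoint_id]
      dataplane_relay = [var.backend_endpoint_id]
    }
  }
}
```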

Co-authored-by: Isaac
Adds modules/gcp/dns for restricted-egress private DNS:
- Hub zones: gcp.databricks.com, gcr.io, googleapis.com, pkg.dev,
  with workspace/psc-auth/dp records pointing at hub PSC IP
- Spoke zone: gcp.databricks.com with workspace/dp records pointing
  at spoke frontend PSC IP, and tunnel record at backend PSC IP
- workspace_dns_id extracted via regex from workspace_url

Split into its own submodule (rather than colocated with PSC) because
DNS depends on workspace_url which only exists after account creates
the workspace; this keeps the composer's dependency graph linear.

Test fixture: hub-and-spoke.
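
One possible shape for the `workspace_dns_id` extraction is shown below. The pattern and the assumed URL format (`https://<id>.gcp.databricks.com`) are illustrative; the module's actual regex is not shown in this PR.

```hcl
# Hypothetical sketch; the pattern and URL format are assumptions.
variable "workspace_url" {
  type = string # e.g. "https://1234567890123456.7.gcp.databricks.com"
}

locals {
  # regex() returns a list when the pattern has unnamed capture groups,
  # so [0] selects the part between the scheme and ".gcp.databricks.com".
  workspace_dns_id = regex("^https://(.+)\\.gcp\\.databricks\\.com", var.workspace_url)[0]
}
```

Because `workspace_url` is only known once the workspace exists, any record built from this local is necessarily ordered after the account submodule.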

Co-authored-by: Isaac
Top-level composer that orchestrates network, private-connectivity,
account, and dns submodules through orthogonal feature flags:
- vpc_source: databricks_managed | create | existing
- private_link_frontend, private_link_backend, private_access_only,
  restricted_egress (each defaults false)
- hub/spoke project + CIDR vars required only when restricted_egress

Submodules instantiated conditionally:
- network when vpc_source != databricks_managed
- private-connectivity when any PrivateLink flag is true
- account always
- dns when restricted_egress is true
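
The gating listed above maps naturally onto module-level `count`. The excerpt below is a sketch, not the composer's real main.tf: the flag names and `local.databricks_managed` appear in this PR, while the relative source paths, module labels, the `any_private_link` local (and whether `private_access_only` counts as a PrivateLink flag) and the omitted inputs are assumptions. The variables are expected to be declared elsewhere in the composer.

```hcl
# Hypothetical excerpt; paths, labels and omitted inputs are assumptions.
locals {
  databricks_managed = var.vpc_source == "databricks_managed"
  any_private_link   = var.private_link_frontend || var.private_link_backend
}

module "network" {
  source = "../network"
  count  = local.databricks_managed ? 0 : 1
  # network inputs omitted
}

module "private_connectivity" {
  source = "../private-connectivity"
  count  = local.any_private_link ? 1 : 0
  # PSC inputs omitted
}

module "account" {
  source = "../account" # always instantiated, so no count
  # workspace inputs omitted
}

module "dns" {
  source = "../dns"
  count  = var.restricted_egress ? 1 : 0

  # dns comes last because it needs the URL of the created workspace.
  workspace_url = module.account.workspace_url
}
```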

Cross-variable preconditions (null_resource.preconditions) enforce
six rules from the design spec, including: restricted_egress requires
vpc_source=create + at least one PrivateLink flag + hub/CIDR vars;
databricks_managed forbids any PrivateLink or restricted_egress.

Random suffix declared once in the composer, passed to each submodule.

Test fixtures: 4 positive (basic, byovpc, existing-vpc, psc-isolated)
and 4 negative (each violating one precondition rule).

Co-authored-by: Isaac
git mv only; no functional changes. Old path has a deprecation README
pointing to the new location. Makefile updated for new depth.

Co-authored-by: Isaac
git mv only; no functional changes. Old path has a deprecation README.

Co-authored-by: Isaac
Generated README content for network, private-connectivity, account,
dns, databricks-workspace, service-account, and unity-catalog via
`make -C modules/gcp docs`. Includes an added README placeholder for
unity-catalog (the original modules/gcp-unity-catalog had no README).

Co-authored-by: Isaac
Migrates all four existing GCP examples to call modules/gcp/databricks-workspace:
- gcp-basic: vpc_source="databricks_managed"
- gcp-byovpc: vpc_source="create" with spoke_vpc_cidr + subnet_cidr
- gcp-with-psc-exfiltration-protection: vpc_source="create" + all 4
  PrivateLink/egress flags + Unity Catalog wired via modules/gcp/unity-catalog
- gcp-sa-provisioning: source repointed from github URL to
  ../../modules/gcp/service-account

Adds a NEW example examples/gcp-existing-vpc demonstrating
vpc_source="existing" (data-source lookup of a pre-existing VPC + subnet).

Variable name changes documented in each example's README migration table:
subnet_ip_cidr_range -> subnet_cidr, pod/svc renames, subnet/router/nat_name
dropped (composer derives), delegate_from dropped (SA-provisioning concern).

State from old applies does NOT migrate cleanly because resource addresses
differ. Re-apply on clean state.

Co-authored-by: Isaac
Deletes:
- modules/gcp-workspace-basic, modules/gcp-workspace-byovpc,
  modules/gcp-with-psc-exfiltration-protection (replaced by the composer
  + submodules under modules/gcp/)
- modules/gcp-sa-provisioning, modules/gcp-unity-catalog deprecation stubs
  (relocated to modules/gcp/{service-account,unity-catalog} in PR 1's
  foundation commit)
- examples/gcp-sa-provisionning (typo dir, only had a Makefile)
- examples/gcp-test-modules (only contained terraform.tfstate files)

All examples now point at modules/gcp/*. Stray terraform.tfstate* files
inside remaining example dirs are gitignored and remain on disk untouched.

Co-authored-by: Isaac
Examples section now lists the three migrated workspace examples
(gcp-basic, gcp-byovpc, gcp-with-psc-exfiltration-protection) plus the
new gcp-existing-vpc and the corrected-spelling gcp-sa-provisioning.

Modules section now lists the modules/gcp/ tree: the composer
(databricks-workspace), its four submodules (network,
private-connectivity, account, dns), and the relocated service-account
and unity-catalog modules.

Co-authored-by: Isaac
micheledaddetta-databricks changed the title from "feat(gcp): add modules/gcp/ composer + submodules (PR 1 of 6)" to "feat(gcp): refactor GCP modules into composer + submodules (full)" on May 22, 2026
Plans a polish pass on the GCP modules and examples landed in PR #233:
- fill ~70 missing variable descriptions
- split large main.tf files by concern
- standardize versions.tf placement
- expand composer outputs, drop redundancy, rename misnomers
- add critical-only region validations
- add README usage examples

Co-authored-by: Isaac
8 tasks corresponding to the 8 commits planned in the spec.
Each task is self-contained with exact file paths, code blocks,
validation commands, and commit messages.

Co-authored-by: Isaac
… outputs

Reorganizes network, account, dns, and databricks-workspace modules
so each .tf file has one clear responsibility, and fixes the
*_psc_fr_id misnomer for outputs/inputs that hold forwarding-rule
names (not IDs).

File splits (no behavioral change):
- network/main.tf -> vpc.tf, subnets.tf, nat.tf, peering.tf,
  shared-vpc.tf, data.tf, locals.tf
- account/main.tf -> workspace.tf, networks.tf, locals.tf
- dns/hub.tf: workspace_dns_id local extracted to dns/locals.tf
- databricks-workspace/main.tf -> main.tf (module blocks only),
  locals.tf, preconditions.tf, random.tf

Renames (coordinated across outputs, inputs, wiring, and fixture):
- private-connectivity outputs: frontend_psc_fr_id -> frontend_forwarding_rule_name,
  backend_psc_fr_id -> backend_forwarding_rule_name,
  hub_frontend_psc_fr_id -> hub_frontend_forwarding_rule_name
- account inputs renamed to match
- composer wiring and psc-with-pas fixture updated accordingly

Co-authored-by: Isaac
…ion validations

Three layered improvements to module surface area and documentation:

Composer outputs (drop redundancy, add downstream-friendly outputs):
- Drop vpc_id (was an alias of spoke_vpc_id)
- Add 14 outputs: private_access_settings_id, frontend/backend/transit
  _endpoint_id, spoke_vpc_self_link, spoke_subnet_id/self_link,
  hub_vpc_self_link, nat_id, frontend/backend_psc_ip_spoke,
  frontend_psc_ip_hub, google_region
- Replace try(module.network[0].*) with explicit local.databricks_managed
  / var.restricted_egress ternaries (same behavior, intent visible)
- Add private_access_settings_id output on the account module
- Update gcp-byovpc and gcp-with-psc examples to consume spoke_vpc_id
  instead of the removed vpc_id

Variable descriptions (~70 added):
- Every variable in private-connectivity, account, dns, and the
  databricks-workspace composer now has a description attribute
  describing purpose, format, nullability semantics, and required-when
  context
- Gap-fill on service-account / unity-catalog where missing

Region validation (clearer error messages):
- private-connectivity validates google_region unconditionally against
  the 14-region list backing the regional PSC service-attachment maps
- composer adds the same validation as a precondition, but gated on
  any private_link_* flag or restricted_egress being true; bad regions
  now fail plan with a clear message instead of a map-lookup error
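
A variable-level check of this kind might look like the following sketch. The three regions are placeholders standing in for the 14-region list, which this PR does not enumerate, and the description and error message are assumptions.

```hcl
# Hypothetical sketch; the region list is a truncated placeholder.
variable "google_region" {
  type        = string
  description = "GCP region for the workspace and its PSC endpoints."

  validation {
    condition = contains(
      ["us-central1", "us-east1", "europe-west1"],
      var.google_region
    )
    error_message = "google_region has no known PSC service attachment."
  }
}
```

A flag-gated variant fits better in a `precondition`, since before Terraform 1.9 a `validation` block could reference only the variable it belongs to.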

Co-authored-by: Isaac
File organization (versions.tf everywhere, no provider configs in modules):
- service-account module: init.tf -> versions.tf + data.tf
  (no provider "google" {} block — caller's responsibility)
- unity-catalog module: terraform.tf -> versions.tf
- examples gcp-basic, gcp-byovpc, gcp-existing-vpc, gcp-sa-provisioning:
  init.tf -> versions.tf + providers.tf (+ data.tf where the original
  carried google_client_openid_userinfo / google_client_config)
- examples gcp-with-psc-exfiltration-protection: terraform.tf -> versions.tf
  (providers.tf already existed)
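
The resulting split could look roughly like this. The provider source addresses are the public registry ones; the version constraints and variable names are assumptions rather than values taken from the repo.

```hcl
# versions.tf (modules and examples): constraints only, no provider blocks.
terraform {
  required_version = ">= 1.3" # assumed constraint

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 5.0" # assumed constraint
    }
    databricks = {
      source  = "databricks/databricks"
      version = ">= 1.0" # assumed constraint
    }
  }
}

# providers.tf (examples only): configuration belongs to the caller.
variable "google_project" {
  type = string
}

variable "google_region" {
  type = string
}

provider "google" {
  project = var.google_project
  region  = var.google_region
}
```

Keeping `provider` blocks out of modules lets callers pass their own configurations, including aliased ones, without the module overriding them.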

README usage sections (terraform-google-modules convention):
- Each module README gets a ## Usage HCL block above the terraform-docs
  marker showing a minimal calling example
- Submodule READMEs note "Typically called by modules/gcp/databricks-
  workspace (the composer)"
- The composer README points to the four scenario examples

terraform-docs regen:
- Auto-generated README sections refreshed to reflect Task-2 renames,
  Task-3 new outputs, Task-4 ~70 variable descriptions, and Task-5
  region validation block

Co-authored-by: Isaac