Clean up direct-engine state entries for remotely deleted resources removed from config by simonfaltum · Pull Request #5496 · databricks/cli

simonfaltum · 2026-06-09T19:58:55Z

Why

Found during a full-repo review of the CLI. With the direct engine, a resource that was deleted remotely and then removed from the bundle config could never leave the deployment state. The plan dropped its Delete entry, so the state cleanup never ran and the stale entry came back on every deploy. For pipelines and dashboards the same scenario was worse: the deploy hard-failed in metadata computation with "failed to compute relative path", and since the state never got cleaned, every following deploy failed the same way.

Changes

Before, deploying after a resource was deleted remotely and removed from config either left a stale state entry forever (jobs) or failed the deploy outright (pipelines, dashboards); now the deploy plans a delete for the stale entry, cleans it from state, and completes.

bundle/direct/bundle_plan.go: when planning a Delete and the remote read reports the resource is gone, keep the Delete entry instead of removing it from the plan. Delete() already tolerates missing resources and removes the entry from state when applied.
bundle/deployplan/plan.go: remove RemoveEntry, whose only caller was the code path above.
bundle/deploy/metadata/compute.go: add the missing l.File == "" guard to the pipelines and dashboards loops, matching the guard the jobs loop received in direct: handle remotely deleted resources #3710. State-only resources have no config location, and filepath.Rel errors on an empty path.

A side effect visible in test output: bundle destroy with the direct engine now lists a remotely deleted resource under "The following resources will be deleted", since its state entry is now explicitly cleaned instead of silently dropped.

Test plan

New acceptance test acceptance/bundle/resources/jobs/remote_delete/removed_from_config covering deploy, remote delete of a job and a pipeline, config removal, deploy twice, and summary, for both engines. With the direct engine the first deploy cleans the stale entries and the second deploy and summary are clean. The terraform variant also exercises the new metadata guard, since terraform keeps the gone resources in its state.
New unit test TestComputeMetadataMutatorStateOnlyResources verifying state-only jobs, pipelines, and dashboards are skipped without error.
Unit tests for bundle/direct, bundle/deployplan, and bundle/deploy/metadata.
Existing acceptance tests around the changed paths (jobs/remote_delete, volumes/remote-delete, volumes/remote-change-name, vector_search_indexes/drift, permissions, deployment/bind, deployment/unbind, bundle/destroy, bundle/generate, bundle/invariant, clusters, quality_monitors, registered_models, synced tables). Only jobs/remote_delete/destroy changed output (direct destroy now lists the gone job) and was regenerated with -update.
./task fmt-q, ./task lint-q, ./task checks.

This pull request and its description were written by Isaac.

… from config When a resource was deleted remotely and also removed from the bundle config, CalculatePlan dropped the Delete entry from the plan, so the stale state entry was never removed and reappeared on every deploy. Keep the Delete entry instead; Delete() already tolerates resources that are gone and removes the entry from state. Also add the missing location guard in metadata.Compute for pipelines and dashboards (jobs got it in #3710), which made the same scenario hard-fail the deploy. Co-authored-by: Isaac

github-actions · 2026-06-09T20:00:24Z

Approval status: pending

`/acceptance/bundle/` - needs approval

13 files changed
Suggested: @denik
Also eligible: @janniklasrose, @pietern, @shreyas-goenka, @lennartkats-db, @andrewnester, @anton-107

`/bundle/` - needs approval

8 files changed
Suggested: @denik
Also eligible: @janniklasrose, @pietern, @shreyas-goenka, @lennartkats-db, @andrewnester, @anton-107

_{Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @renaudhartert-db) can approve all areas.

See OWNERS for ownership rules.}

eng-dev-ecosystem-bot · 2026-06-09T20:51:59Z

Integration test report

Commit: 232ca17

Run: 27410895224

	Env	🟨KNOWN	🔄flaky	💚RECOVERED	🙈SKIP	✅pass	🙈skip	Time
💚	aws linux			7	15	264	974	9:27
🟨	aws windows	7			15	266	972	16:33
🔄	aws-ucws linux		4	7	15	356	888	28:30
💚	aws-ucws windows			7	15	362	886	16:22
💚	azure linux			1	17	267	972	9:43
💚	azure windows			1	17	269	970	12:20
🔄	azure-ucws linux		1	1	17	364	884	24:37
💚	azure-ucws windows			1	17	367	882	12:42
💚	gcp linux			1	17	263	975	12:38
💚	gcp windows			1	17	265	973	12:10

27 interesting tests: 15 SKIP, 7 KNOWN, 5 flaky

	Test Name	aws linux	aws windows	aws-ucws linux	aws-ucws windows	azure linux	azure windows	azure-ucws linux	azure-ucws windows	gcp linux	gcp windows
🟨	TestAccept	💚R	🟨K	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R
🙈	TestAccept/bundle/invariant/no_drift	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🔄	TestAccept/bundle/resources/model_serving_endpoints/basic	🙈s	🙈s	🔄f	✅p	🙈s	🙈s	✅p	✅p	🙈s	🙈s
🔄	TestAccept/bundle/resources/model_serving_endpoints/basic/DATABRICKS_BUNDLE_ENGINE=direct			🔄f	✅p			✅p	✅p
🙈	TestAccept/bundle/resources/permissions	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🔄	TestAccept/bundle/resources/permissions/dashboards/create	✅p	✅p	🔄f	✅p	✅p	✅p	✅p	✅p	🙈s	🙈s
🔄	TestAccept/bundle/resources/permissions/dashboards/create/DATABRICKS_BUNDLE_ENGINE=direct	✅p	✅p	🔄f	✅p	✅p	✅p	✅p	✅p
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions	💚R	🟨K	💚R	💚R	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct	💚R	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform	💚R	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions	💚R	🟨K	💚R	💚R	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct	💚R	🟨K	💚R	💚R
🟨	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform	💚R	🟨K	💚R	💚R
🙈	TestAccept/bundle/resources/postgres_branches/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/recreate	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/replace_existing	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/update_protected	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/without_branch_id	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_endpoints/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_endpoints/recreate	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_projects/update_display_name	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/synced_database_tables/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_indexes/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_indexes/grants/select	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/ssh/connection	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🔄	TestSyncIncrementalSyncFileToPythonNotebook	✅p	✅p	✅p	✅p	✅p	✅p	🔄f	✅p	✅p	✅p

Top 50 slowest tests (at least 2 minutes):

duration	env	testname
7:27	azure windows	TestAccept
6:12	gcp windows	TestAccept
5:48	aws-ucws windows	TestAccept
5:34	gcp linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
5:15	aws-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
5:09	azure-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
5:03	azure-ucws windows	TestAccept
5:01	azure-ucws linux	TestAccept/bundle/deploy/files/no-snapshot-sync/DATABRICKS_BUNDLE_ENGINE=direct
4:57	gcp windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:49	aws-ucws linux	TestAccept/bundle/resources/volumes/recreate/DATABRICKS_BUNDLE_ENGINE=direct
4:42	azure-ucws linux	TestAccept/bundle/resources/model_serving_endpoints/basic/DATABRICKS_BUNDLE_ENGINE=direct
4:42	aws-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:38	gcp windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:29	azure-ucws linux	TestAccept/bundle/resources/volumes/recreate/DATABRICKS_BUNDLE_ENGINE=terraform
4:18	azure-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:12	aws-ucws linux	TestAccept/bundle/resources/model_serving_endpoints/basic/DATABRICKS_BUNDLE_ENGINE=terraform
4:03	gcp linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:53	aws-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:44	aws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:41	aws-ucws linux	TestAccept/bundle/deploy/files/no-snapshot-sync/DATABRICKS_BUNDLE_ENGINE=direct
3:34	aws-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:31	azure linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:24	azure-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:23	aws-ucws linux	TestAccept/bundle/deployment/bind/job/generate-and-bind/DATABRICKS_BUNDLE_ENGINE=direct
3:22	aws-ucws linux	TestAccept/bundle/deploy/files/no-snapshot-sync/DATABRICKS_BUNDLE_ENGINE=terraform
3:14	azure-ucws linux	TestAccept/bundle/resources/volumes/recreate/DATABRICKS_BUNDLE_ENGINE=direct
3:11	azure-ucws linux	TestAccept/bundle/deployment/bind/job/generate-and-bind/DATABRICKS_BUNDLE_ENGINE=terraform
3:10	azure windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:04	aws-ucws linux	TestAccept/bundle/destroy/jobs-and-pipeline/DATABRICKS_BUNDLE_ENGINE=terraform
3:02	azure linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:55	aws-ucws linux	TestAccept/bundle/resources/volumes/recreate/DATABRICKS_BUNDLE_ENGINE=terraform
2:53	azure linux	TestAccept
2:46	aws linux	TestAccept
2:45	aws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:44	azure-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:44	azure-ucws linux	TestAccept
2:43	azure-ucws linux	TestAccept/bundle/resources/permissions/dashboards/create/DATABRICKS_BUNDLE_ENGINE=direct
2:42	gcp linux	TestAccept
2:40	azure windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:36	aws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:35	azure-ucws linux	TestAccept/bundle/deploy/mlops-stacks/DATABRICKS_BUNDLE_ENGINE=terraform
2:32	aws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:31	aws-ucws linux	TestAccept/bundle/resources/permissions/dashboards/create/DATABRICKS_BUNDLE_ENGINE=direct
2:28	azure-ucws linux	TestAccept/bundle/deployment/bind/job/generate-and-bind/DATABRICKS_BUNDLE_ENGINE=direct
2:28	aws-ucws linux	TestAccept/bundle/resources/model_serving_endpoints/basic/DATABRICKS_BUNDLE_ENGINE=direct
2:26	azure-ucws linux	TestAccept/bundle/run_as/job_default/DATABRICKS_BUNDLE_ENGINE=direct
2:26	azure-ucws linux	TestAccept/bundle/resources/model_serving_endpoints/basic/DATABRICKS_BUNDLE_ENGINE=terraform
2:25	aws-ucws linux	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct
2:22	azure-ucws linux	TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=direct
2:14	aws-ucws linux	TestAccept/bundle/deploy/mlops-stacks/DATABRICKS_BUNDLE_ENGINE=direct

Co-authored-by: Isaac

simonfaltum · 2026-06-10T05:26:25Z

The direct engine now keeps the delete entry for a remotely deleted resource so its stale state entry gets cleaned up during destroy, which makes the destroy output of acceptance/bundle/resources/dashboards/delete-trashed-out-of-band legitimately diverge per engine. Split that section into out.destroy.terraform.txt and out.destroy.direct.txt per the repo convention, same as the existing jobs/remote_delete/destroy test.

denik

Nice find, thanks!

denik · 2026-06-10T09:26:29Z

 		// root
 		l := b.Config.GetLocation("resources.pipelines." + name)
+		if l.File == "" {
+			// b.Config.Resources.Pipelines may include a pipeline that only exists in state but not in config


I don't understand this comment, we iterate over a config there, so pipeline in question must exist in config.

Good catch, the wording was confusing. At this point b.Config.Resources is not only user config: deployCore runs statemgmt.Load right before metadata.Compute, and StateToBundle injects resources that exist only in state (just id and modified_status, no file location). The terraform variant of the removed_from_config test hits this, since terraform keeps the gone resources in its state. Same guard the jobs loop got in #3710. Reworded the comments in a01eb5d to explain this.

denik · 2026-06-10T09:30:15Z

-					// This is not an error during deletion, so don't return false here
-				}
+			// If the resource is already gone remotely, keep the Delete entry: Delete()
+			// tolerates missing resources and applying the entry removes it from the state.


I'd like to avoid unnecessary call though, for 2 reasons: 1) performance and 2) to reduce chance of failing there because of unexpected error.

Perhaps we should pass this info in plan entry to deploy. E.g. set gone: true on the entry (we cannot use remote == nil since it will also be nil in case of read error and in that case we do want to call delete).

Related:

A side effect visible in test output: bundle destroy with the direct engine now lists a remotely deleted resource under "The following resources will be deleted", since its state entry is now explicitly cleaned instead of silently dropped.

This is not great because the message is not correct anymore (we know delete is going to be nop-op). If we track it like I suggest above, we can filter out such plan entries from the warning.

Done in a01eb5d. Delete entries now get gone: true when the planning read returns 404, and apply then only removes the entry from state without calling delete. Read errors still leave gone unset, so delete is attempted as before. The removed_from_config test now also asserts that no delete API calls are made for the gone resources.

Fixed in a01eb5d. Deploy and destroy approvals now filter out gone entries, so they are not listed as deletions, do not trigger approval prompts, and do not trip prevent_destroy. With that the destroy output no longer diverges per engine, so I reverted the out.destroy split for delete-trashed-out-of-band.

Per review feedback: planning now sets gone on a Delete entry when the remote read returns 404. Applying such an entry removes it from state without calling the delete API, and deploy/destroy approval prompts no longer list it as a deletion or ask for approval, since nothing is actually deleted. With that the destroy output for remotely deleted resources matches the terraform engine again, so the per-engine out.destroy split is reverted for delete-trashed-out-of-band and the removed_from_config test now also asserts that no delete API calls are made for gone resources. Also reword the state-only resource comments in metadata.Compute to explain the statemgmt.Load injection that makes the guards necessary. Co-authored-by: Isaac

simonfaltum · 2026-06-10T16:44:54Z

Addressed the review feedback in a01eb5d:

Delete entries for resources that no longer exist remotely are now marked gone: true in the plan. Apply removes them from state without calling the delete API, and deploy/destroy approvals skip them (no deletion listing, no approval prompt, no prevent_destroy error), since nothing is actually deleted.
With gone entries filtered from the destroy listing, the destroy output for delete-trashed-out-of-band no longer diverges per engine, so the out.destroy.<engine>.txt split from the earlier revision is reverted, and the jobs/remote_delete/destroy output is back to what main has.
removed_from_config now records requests and asserts that no delete API calls are made for the gone resources.
Reworded the state-only resource comments in metadata.Compute to explain the statemgmt.Load injection that makes the guards necessary.

Validation: ./task fmt-q, ./task lint-q, ./task checks, unit tests for bundle/deployplan, bundle/direct, bundle/deploy/metadata, bundle/phases, and the full TestAccept/bundle acceptance suite pass locally.

simonfaltum requested a review from denik June 9, 2026 19:59

simonfaltum temporarily deployed to test-trigger-is June 9, 2026 19:59 — with GitHub Actions Inactive

Remove redundant comments

941baf9

Co-authored-by: Isaac

simonfaltum temporarily deployed to test-trigger-is June 9, 2026 21:43 — with GitHub Actions Inactive

Record direct-engine output for delete-trashed-out-of-band

24c3538

Co-authored-by: Isaac

simonfaltum temporarily deployed to test-trigger-is June 10, 2026 05:26 — with GitHub Actions Inactive

denik reviewed Jun 10, 2026

View reviewed changes

simonfaltum temporarily deployed to test-trigger-is June 10, 2026 16:44 — with GitHub Actions Inactive

simonfaltum requested a review from denik June 12, 2026 10:38

Merge branch 'main' into simonfaltum/b22-direct-stale-state

232ca17

simonfaltum temporarily deployed to test-trigger-is June 12, 2026 10:45 — with GitHub Actions Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clean up direct-engine state entries for remotely deleted resources removed from config#5496

Clean up direct-engine state entries for remotely deleted resources removed from config#5496
simonfaltum wants to merge 5 commits into
mainfrom
simonfaltum/b22-direct-stale-state

simonfaltum commented Jun 9, 2026

Uh oh!

github-actions Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

eng-dev-ecosystem-bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

simonfaltum commented Jun 10, 2026

Uh oh!

denik left a comment

Uh oh!

denik Jun 10, 2026

Uh oh!

simonfaltum Jun 10, 2026

Uh oh!

denik Jun 10, 2026

Uh oh!

denik Jun 10, 2026

Uh oh!

simonfaltum Jun 10, 2026

Uh oh!

simonfaltum Jun 10, 2026

Uh oh!

simonfaltum commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

simonfaltum commented Jun 9, 2026

Why

Changes

Test plan

Uh oh!

github-actions Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Approval status: pending

/acceptance/bundle/ - needs approval

/bundle/ - needs approval

Uh oh!

eng-dev-ecosystem-bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Integration test report

Uh oh!

simonfaltum commented Jun 10, 2026

Uh oh!

denik left a comment

Choose a reason for hiding this comment

Uh oh!

denik Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

simonfaltum Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

denik Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

denik Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

simonfaltum Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

simonfaltum Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

simonfaltum commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions Bot commented Jun 9, 2026 •

edited

Loading

`/acceptance/bundle/` - needs approval

`/bundle/` - needs approval

eng-dev-ecosystem-bot commented Jun 9, 2026 •

edited

Loading