Commit bfe0a58

Add --select flag to bundle config-remote-sync
Add an opt-in --select flag to the experimental `bundle config-remote-sync` command that restricts detected and saved changes to specific resources.

Each selector is "<type>:<id>" (e.g. jobs:123456789): the resource type plus the deployed resource ID. This is the pair the workspace UI knows from a resource's page, which has the ID but not the bundle "type.name" key. The type is required because a resource ID is only unique within a type, so an ID that collides across types would otherwise select the wrong resource. Selection is therefore independent of `bundle deploy --select`, which matches keys.

Selectors are resolved to plan keys against the deployment state after planning, so ${resources.*} references still resolve; only the emitted change set is restricted. A selector that matches no deployed resource is an error. Default behavior without the flag is unchanged.

This limits the blast radius of a sync run: syncing one resource can no longer rewrite an unrelated drifted resource's configuration.
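The "<type>:<id>" selector shape described above can be illustrated with a small bash sketch. This is not the CLI's implementation: `parse_selector` is a hypothetical helper, and rejecting an empty type or an empty ID is an assumption that goes beyond what the commit message states (it only says the type is required).

```shell
#!/bin/bash

# Hypothetical helper (not part of the CLI): split a "<type>:<id>" selector
# into its two parts, rejecting values that carry no type.
parse_selector() {
    local selector="$1"
    local type="${selector%%:*}"   # text before the first colon
    local id="${selector#*:}"      # text after the first colon

    # Assumption: a missing colon, an empty type, and an empty id are all invalid.
    if [[ "$selector" != *:* || -z "$type" || -z "$id" ]]; then
        echo "invalid --select value \"$selector\", expected <type>:<id>" >&2
        return 1
    fi

    echo "$type $id"
}

parse_selector "jobs:123456789"
```

Splitting on the first colon keeps the type unambiguous even if an ID were to contain a colon itself.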
1 parent e8b6f70 commit bfe0a58

14 files changed

Lines changed: 567 additions & 18 deletions

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+bundle:
+  name: test-bundle-$UNIQUE_NAME
+
+resources:
+  jobs:
+    job_one:
+      max_concurrent_runs: 1
+      tasks:
+        - task_key: main
+          notebook_task:
+            notebook_path: /Users/{{workspace_user_name}}/job1
+          new_cluster:
+            spark_version: $DEFAULT_SPARK_VERSION
+            node_type_id: $NODE_TYPE_ID
+            num_workers: 1
+
+    job_two:
+      max_concurrent_runs: 2
+      tasks:
+        - task_key: main
+          notebook_task:
+            notebook_path: /Users/{{workspace_user_name}}/job2
+          new_cluster:
+            spark_version: $DEFAULT_SPARK_VERSION
+            node_type_id: $NODE_TYPE_ID
+            num_workers: 1
+
+targets:
+  default:
+    mode: development

acceptance/bundle/config-remote-sync/select_basic/out.test.toml

Lines changed: 4 additions & 0 deletions
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+Uploading bundle files to /Workspace/Users/[USERNAME]/.bundle/test-bundle-[UNIQUE_NAME]/default/files...
+Deploying resources...
+Updating deployment state...
+Deployment complete!
+
+=== Modify both jobs remotely
+=== Sync only job_one, selected by its type and deployed resource id
+Detected changes in 1 resource(s):
+
+Resource: resources.jobs.job_one
+max_concurrent_runs: replace
+
+
+
+=== Only job_one is updated; job_two is left untouched
+
+>>> diff.py databricks.yml.backup databricks.yml
+--- databricks.yml.backup
++++ databricks.yml
+@@ -5,5 +5,5 @@
+   jobs:
+     job_one:
+-      max_concurrent_runs: 1
++      max_concurrent_runs: 5
+       tasks:
+         - task_key: main
+
+=== Selecting job_one again is idempotent
+No changes detected.
+
+
+=== Unfiltered sync still detects the job_two drift (no lost updates)
+Detected changes in 1 resource(s):
+
+Resource: resources.jobs.job_two
+max_concurrent_runs: replace
+
+
+
+=== An unknown resource id is rejected
+>>> [CLI] bundle config-remote-sync --select jobs:no-such-id-123
+Error: no deployed jobs resource with id no-such-id-123
+
+Exit code: 1
+
+=== A selector without a type is rejected
+>>> [CLI] bundle config-remote-sync --select no-such-id-123
+Error: invalid --select value "no-such-id-123", expected <type>:<id> (e.g. jobs:[NUMID])
+
+Exit code: 1
+
+=== An id that exists under a different type is rejected (no cross-type collision)
+>>> [CLI] bundle config-remote-sync --select pipelines:[JOB_ONE_ID]
+Error: no deployed pipelines resource with id [JOB_ONE_ID]
+
+Exit code: 1
+
+>>> [CLI] bundle destroy --auto-approve
+The following resources will be deleted:
+delete resources.jobs.job_one
+delete resources.jobs.job_two
+
+All files and directories at the following location will be deleted: /Workspace/Users/[USERNAME]/.bundle/test-bundle-[UNIQUE_NAME]/default
+
+Deleting files...
+Destroy complete!
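The three rejection cases in this recorded output follow from matching on the (type, id) pair rather than on the ID alone. The bash sketch below illustrates that lookup under stated assumptions: `DEPLOYED_STATE` is a made-up, line-oriented stand-in for the deployment state, `resolve_selector` is a hypothetical helper, and the IDs 111 and 222 are invented. It is not how the CLI stores or reads state.

```shell
#!/bin/bash

# Made-up stand-in for the deployment state: one "<type> <id> <plan key>" per line.
DEPLOYED_STATE="jobs 111 resources.jobs.job_one
jobs 222 resources.jobs.job_two"

# Hypothetical helper: resolve a "<type>:<id>" selector to a plan key.
# Both the type and the id must match, so an id that exists under another
# type is reported as not deployed instead of selecting the wrong resource.
resolve_selector() {
    local want_type="${1%%:*}"
    local want_id="${1#*:}"
    local type id key

    while read -r type id key; do
        if [[ "$type" == "$want_type" && "$id" == "$want_id" ]]; then
            echo "$key"
            return 0
        fi
    done <<< "$DEPLOYED_STATE"

    echo "Error: no deployed $want_type resource with id $want_id" >&2
    return 1
}

resolve_selector "jobs:111"
```

With this table, `resolve_selector "pipelines:111"` fails even though 111 is a deployed job ID, which mirrors the cross-type case in the test.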
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+#!/bin/bash
+
+envsubst < databricks.yml.tmpl > databricks.yml
+
+cleanup() {
+    trace $CLI bundle destroy --auto-approve
+}
+trap cleanup EXIT
+
+$CLI bundle deploy
+job_one_id="$(read_id.py job_one)"
+job_two_id="$(read_id.py job_two)"
+
+title "Modify both jobs remotely"
+edit_resource.py jobs $job_one_id <<EOF
+r["max_concurrent_runs"] = 5
+EOF
+
+edit_resource.py jobs $job_two_id <<EOF
+r["max_concurrent_runs"] = 10
+EOF
+
+title "Sync only job_one, selected by its type and deployed resource id"
+echo
+cp databricks.yml databricks.yml.backup
+$CLI bundle config-remote-sync --select "jobs:$job_one_id" --save
+
+title "Only job_one is updated; job_two is left untouched"
+echo
+trace diff.py databricks.yml.backup databricks.yml
+rm databricks.yml.backup
+
+title "Selecting job_one again is idempotent"
+echo
+$CLI bundle config-remote-sync --select "jobs:$job_one_id"
+
+title "Unfiltered sync still detects the job_two drift (no lost updates)"
+echo
+$CLI bundle config-remote-sync
+
+title "An unknown resource id is rejected"
+errcode trace $CLI bundle config-remote-sync --select jobs:no-such-id-123
+
+title "A selector without a type is rejected"
+errcode trace $CLI bundle config-remote-sync --select no-such-id-123
+
+title "An id that exists under a different type is rejected (no cross-type collision)"
+errcode trace $CLI bundle config-remote-sync --select "pipelines:$job_one_id"
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+Cloud = true
+
+RecordRequests = false
+Ignore = [".databricks", "databricks.yml", "databricks.yml.backup"]
+
+[Env]
+DATABRICKS_BUNDLE_ENABLE_EXPERIMENTAL_YAML_SYNC = "true"
+
+[EnvMatrix]
+DATABRICKS_BUNDLE_ENGINE = ["direct", "terraform"]
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+bundle:
+  name: test-bundle-$UNIQUE_NAME
+
+resources:
+  jobs:
+    job_a:
+      max_concurrent_runs: 1
+      tasks:
+        - task_key: main
+          notebook_task:
+            notebook_path: /Users/{{workspace_user_name}}/job_a
+          new_cluster:
+            spark_version: $DEFAULT_SPARK_VERSION
+            node_type_id: $NODE_TYPE_ID
+            num_workers: 1
+
+    job_b:
+      max_concurrent_runs: 1
+      tasks:
+        - task_key: main
+          notebook_task:
+            notebook_path: /Users/{{workspace_user_name}}/job_b
+          new_cluster:
+            spark_version: $DEFAULT_SPARK_VERSION
+            node_type_id: $NODE_TYPE_ID
+            num_workers: 1
+
+    job_c:
+      max_concurrent_runs: 1
+      tasks:
+        - task_key: main
+          notebook_task:
+            notebook_path: /Users/{{workspace_user_name}}/job_c
+          new_cluster:
+            spark_version: $DEFAULT_SPARK_VERSION
+            node_type_id: $NODE_TYPE_ID
+            num_workers: 1
+
+targets:
+  default:
+    mode: development

acceptance/bundle/config-remote-sync/select_multiple/out.test.toml

Lines changed: 4 additions & 0 deletions
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+Uploading bundle files to /Workspace/Users/[USERNAME]/.bundle/test-bundle-[UNIQUE_NAME]/default/files...
+Deploying resources...
+Updating deployment state...
+Deployment complete!
+
+=== Modify all three jobs remotely
+=== Comma-separated selectors preview two of three resources
+Detected changes in 2 resource(s):
+
+Resource: resources.jobs.job_a
+max_concurrent_runs: replace
+
+Resource: resources.jobs.job_b
+max_concurrent_runs: replace
+
+
+
+=== Repeating the same selector dedupes silently
+Detected changes in 1 resource(s):
+
+Resource: resources.jobs.job_c
+max_concurrent_runs: replace
+
+
+
+=== Save with repeated --select flags
+Detected changes in 2 resource(s):
+
+Resource: resources.jobs.job_a
+max_concurrent_runs: replace
+
+Resource: resources.jobs.job_b
+max_concurrent_runs: replace
+
+
+
+=== job_a and job_b are updated, job_c is untouched
+
+>>> diff.py databricks.yml.backup databricks.yml
+--- databricks.yml.backup
++++ databricks.yml
+@@ -5,5 +5,5 @@
+   jobs:
+     job_a:
+-      max_concurrent_runs: 1
++      max_concurrent_runs: 5
+       tasks:
+         - task_key: main
+@@ -16,5 +16,5 @@
+
+     job_b:
+-      max_concurrent_runs: 1
++      max_concurrent_runs: 5
+       tasks:
+         - task_key: main
+
+=== Unfiltered sync still detects the job_c drift
+Detected changes in 1 resource(s):
+
+Resource: resources.jobs.job_c
+max_concurrent_runs: replace
+
+
+
+>>> [CLI] bundle destroy --auto-approve
+The following resources will be deleted:
+delete resources.jobs.job_a
+delete resources.jobs.job_b
+delete resources.jobs.job_c
+
+All files and directories at the following location will be deleted: /Workspace/Users/[USERNAME]/.bundle/test-bundle-[UNIQUE_NAME]/default
+
+Deleting files...
+Destroy complete!
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+#!/bin/bash
+
+envsubst < databricks.yml.tmpl > databricks.yml
+
+cleanup() {
+    trace $CLI bundle destroy --auto-approve
+}
+trap cleanup EXIT
+
+$CLI bundle deploy
+job_a_id="$(read_id.py job_a)"
+job_b_id="$(read_id.py job_b)"
+job_c_id="$(read_id.py job_c)"
+
+title "Modify all three jobs remotely"
+for id in $job_a_id $job_b_id $job_c_id; do
+    edit_resource.py jobs $id <<EOF
+r["max_concurrent_runs"] = 5
+EOF
+done
+
+title "Comma-separated selectors preview two of three resources"
+echo
+$CLI bundle config-remote-sync --select "jobs:$job_a_id,jobs:$job_b_id"
+
+title "Repeating the same selector dedupes silently"
+echo
+$CLI bundle config-remote-sync --select "jobs:$job_c_id,jobs:$job_c_id"
+
+title "Save with repeated --select flags"
+echo
+cp databricks.yml databricks.yml.backup
+$CLI bundle config-remote-sync --select "jobs:$job_a_id" --select "jobs:$job_b_id" --save
+
+title "job_a and job_b are updated, job_c is untouched"
+echo
+trace diff.py databricks.yml.backup databricks.yml
+rm databricks.yml.backup
+
+title "Unfiltered sync still detects the job_c drift"
+echo
+$CLI bundle config-remote-sync
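This script exercises three input shapes for --select: comma-separated values, repeated flags, and a repeated selector that is deduplicated silently. One way to picture the normalization is the bash sketch below. `normalize_selectors` is a hypothetical helper written for illustration; whether the CLI preserves first-seen order or skips empty entries is an assumption, not something the test output shows.

```shell
#!/bin/bash

# Hypothetical helper: flatten repeated and comma-separated --select values
# into one selector per line, dropping duplicates.
# Assumptions: first-seen order is kept and empty entries are skipped.
normalize_selectors() {
    local arg part
    local -a parts
    local seen=" "   # space-delimited list of selectors already emitted

    for arg in "$@"; do
        # Split one flag value on commas.
        IFS=',' read -r -a parts <<< "$arg"
        for part in "${parts[@]}"; do
            [[ -z "$part" ]] && continue
            if [[ "$seen" != *" $part "* ]]; then
                seen+="$part "
                echo "$part"
            fi
        done
    done
}

# One comma-separated value plus a repeated flag value.
normalize_selectors "jobs:1,jobs:2" "jobs:1"
```

Treating both input shapes as one flat, deduplicated list means `--select a,b` and `--select a --select b` select the same resources, which is what the recorded output for this test shows.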
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+Cloud = true
+
+RecordRequests = false
+Ignore = [".databricks", "databricks.yml", "databricks.yml.backup"]
+
+[Env]
+DATABRICKS_BUNDLE_ENABLE_EXPERIMENTAL_YAML_SYNC = "true"
+
+[EnvMatrix]
+DATABRICKS_BUNDLE_ENGINE = ["direct", "terraform"]
