#254 Increase ACI resources, tune RQ2, drop default strategy #339
Merged
jathavaan merged 4 commits on May 24, 2026
…tegy Bump all ACI containers to 4 vCPU / 16 GB RAM (max for most Azure regions). Reduce national-scale spatial join iterations from 5 to 3 and raise per-iteration timeout to 90 minutes (including warmup). Remove Sedona default-strategy benchmarks — iterations consistently timed out or failed, making reliable measurement infeasible. Increase Databricks driver max result size to 16g. Update data release to 2026-05-23.1.
Pull request overview
This PR updates benchmark orchestration/config to increase ACI container resources, reduce/tune the RQ2 (national-scale spatial join) run length, and remove the infeasible Sedona “default strategy” experiments from the benchmark suite.
Changes:
- Increase ACI experiment resources in `benchmarks.yml` (3 vCPU / 8 GB → 4 vCPU / 16 GB) and update the data release string.
- Tune RQ2 execution by lowering `NATIONAL_SCALE_SPATIAL_JOIN` iterations (5 → 3) and extending the timeout ceiling (3600s → 5400s), including warmup enforcement.
- Remove Databricks “default” national-scale spatial join entrypoints and dispatch/import wiring, and update docs to reflect the dropped variant.
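The iteration and timeout changes trade fewer repetitions for a longer per-iteration ceiling. As a rough check on the worst-case wall-clock budget per experiment, the arithmetic can be sketched as below. The helper `worst_case_seconds` is hypothetical, and the single warmup iteration is an assumption; the PR only states that the timeout now also applies to warmup.

```python
def worst_case_seconds(iterations: int, timeout_s: int, warmups: int = 0) -> int:
    """Upper bound on run time if every iteration hits the timeout ceiling."""
    return (iterations + warmups) * timeout_s


# Timed iterations only: old configuration (5 x 3600 s) vs new (3 x 5400 s).
old_timed = worst_case_seconds(iterations=5, timeout_s=3600)
new_timed = worst_case_seconds(iterations=3, timeout_s=5400)

# New configuration including one warmup that is also subject to the timeout
# (the warmup count of 1 is an assumption, not taken from the PR).
new_total = worst_case_seconds(iterations=3, timeout_s=5400, warmups=1)

print(old_timed / 3600, new_timed / 3600, new_total / 3600)  # hours
```

Under these assumptions the timed portion shrinks from 5 h to 4.5 h in the worst case, and the warmup adds at most another 1.5 h.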
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `benchmarks.yml` | Bumps all experiment CPU/memory and removes default-strategy experiments from the matrix. |
| `src/config.py` | Updates data release, increases max timed window, and increases Databricks driver max result size. |
| `src/domain/enums/benchmark_iteration.py` | Lowers national-scale spatial join iteration ceiling from 5 to 3. |
| `src/application/common/monitor.py` | Applies timeout enforcement to warmup iterations and avoids entering timed loop if warmup times out. |
| `src/presentation/entrypoints/national_scale_spatial_join_databricks_default_2_nodes.py` | Deletes default-strategy 2-worker Databricks entrypoint. |
| `src/presentation/entrypoints/national_scale_spatial_join_databricks_default_8_nodes.py` | Deletes default-strategy 8-worker Databricks entrypoint. |
| `src/presentation/entrypoints/national_scale_spatial_join_databricks_default_16_nodes.py` | Deletes default-strategy 16-worker Databricks entrypoint. |
| `src/presentation/entrypoints/__init__.py` | Removes exports for deleted default-strategy Databricks entrypoints. |
| `benchmark_runner.py` | Removes imports and dispatch cases for deleted default-strategy Databricks script IDs. |
| `README.md` | Updates narrative and batch listing to reflect removal of default strategy and new timeout/iteration values. |
| `CLAUDE.md` | Updates stopping-rule rationale text to match the new 3-iteration RQ2 configuration. |
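The `monitor.py` change (enforce the timeout on warmup, and skip the timed loop if warmup times out) could look roughly like the sketch below. This is not the repository's implementation: `RunResult`, `_run_with_timeout`, and `run_benchmark` are hypothetical names, and the thread-based timeout is an assumption chosen to keep the example self-contained.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RunResult:
    warmup_timed_out: bool = False
    timings_s: List[float] = field(default_factory=list)
    timed_out_iterations: int = 0


def _run_with_timeout(task: Callable[[], None], timeout_s: float) -> Optional[float]:
    """Return elapsed seconds, or None if the task exceeded timeout_s."""
    pool = ThreadPoolExecutor(max_workers=1)
    start = time.perf_counter()
    future = pool.submit(task)
    try:
        future.result(timeout=timeout_s)
        return time.perf_counter() - start
    except FutureTimeout:
        return None
    finally:
        # Do not block on a still-running task. A thread cannot be killed,
        # so a real runner would cancel the underlying job instead.
        pool.shutdown(wait=False)


def run_benchmark(task: Callable[[], None],
                  iterations: int = 3,
                  timeout_s: float = 5400.0) -> RunResult:
    result = RunResult()
    # Warmup is subject to the same ceiling as timed iterations.
    if _run_with_timeout(task, timeout_s) is None:
        result.warmup_timed_out = True
        return result  # never enter the timed loop
    for _ in range(iterations):
        elapsed = _run_with_timeout(task, timeout_s)
        if elapsed is None:
            result.timed_out_iterations += 1
        else:
            result.timings_s.append(elapsed)
    return result
```

Returning early on a warmup timeout avoids spending up to `iterations * timeout_s` more seconds on a workload that has already shown it cannot finish within the ceiling.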
…config Delete the default-strategy Databricks notebook and all remaining references (Config paths, NotebookVariant literal, service dispatch branches, interface docstring). Fix README notebook count (three → two). Increase Databricks driver memory (9g → 14g) and overhead (512m → 1g) to address OOMs, cap maxResultSize at 8g to stay within driver heap.
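The reasoning behind capping `maxResultSize` below the driver heap can be made explicit with a small consistency check. The sketch below uses the standard Spark property names `spark.driver.memory`, `spark.driver.memoryOverhead`, and `spark.driver.maxResultSize`; how the repository actually passes these values to Databricks is not shown in the PR, and `parse_size` and `check_driver_conf` are hypothetical helpers.

```python
import re
from typing import Dict

_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(value: str) -> int:
    """Convert a JVM-style size string such as '14g' or '512m' to bytes."""
    match = re.fullmatch(r"(\d+)([kmgt])b?", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def check_driver_conf(conf: Dict[str, str]) -> None:
    """Raise if collected results alone could exceed the driver heap."""
    heap = parse_size(conf["spark.driver.memory"])
    max_result = parse_size(conf["spark.driver.maxResultSize"])
    if max_result >= heap:
        raise ValueError("spark.driver.maxResultSize must be below spark.driver.memory")


# Values taken from the commit message: 14g heap, 1g overhead, 8g result cap.
new_conf = {
    "spark.driver.memory": "14g",
    "spark.driver.memoryOverhead": "1g",
    "spark.driver.maxResultSize": "8g",
}
check_driver_conf(new_conf)  # passes silently
```

By the same check, the intermediate state of a 16g result limit with a 9g heap would be rejected, which is consistent with the cap introduced in this commit.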
Summary
- Increase `DATABRICKS_DRIVER_MAX_RESULT_SIZE` to 16g
- Update data release to 2026-05-23.1
Test plan
- `benchmarks.yml` parses correctly via `python main.py` dry-run
- `python -c "from src.presentation.entrypoints import *"`