[DOM-75519] feat: migrate job launcher from v4 to v1/beta Domino Jobs API#19
ddl-subir-m wants to merge 17 commits into subir/pr1-http-layer
- Add DominoProjectType enum (DFS/GIT/UNKNOWN) with filesystem-based detection
- Add _db_url_remap for cross-project SQLite URL remapping across mount types
- Add tabular_data module: centralized CSV/parquet preview, schema, row counting with LRU caching (replaces scattered pd.read_csv/parquet calls)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lver
- Add normalize_leaderboard_rows/payload to fix TimeSeries fit_time display
- Add resolve_request_project_id() to centralize project context extraction from X-Project-Id header, query params, and DOMINO_PROJECT_ID env var
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…roject_id
The env var is the App's own project, not the target project the user is working in. Falling back to it silently operates on the wrong project (root cause of datasets showing empty in cross-project scenarios).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add DominoProjectType enum (DFS/GIT/UNKNOWN) with filesystem-based detection
- Add _db_url_remap for cross-project SQLite URL remapping across mount types
- Add tabular_data module: centralized CSV/parquet preview, schema, row counting with LRU caching (replaces scattered pd.read_csv/parquet calls)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Switch job launch from POST /v4/jobs/start to POST /api/jobs/v1/jobs
- Switch job status from GET /v4/jobs/{id} to GET /api/jobs/beta/jobs/{id}
- Remove _resolve_hardware_tier_id (v1 API accepts tier name directly)
- Add _job_api_request with direct-host-first fallback
- Add _remap_db_url_for_target for cross-project database paths
- Pass database_url and job_config as CLI args to workers
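A minimal sketch of how the launcher side of "pass database_url and job_config as CLI args" could look. The helper name `build_worker_command`, the script name `worker.py`, and the config keys are hypothetical and not taken from this PR; only the `--database-url` and `--job-config` flags appear in the PR description.

```python
import json
import shlex


def build_worker_command(script: str, database_url: str, job_config: dict) -> str:
    """Build a shell-safe command string for a worker job.

    Hypothetical helper: serializes the job config as JSON and quotes every
    argument so the string can be passed as a single job command.
    """
    args = [
        "python",
        script,
        "--database-url",
        database_url,
        "--job-config",
        json.dumps(job_config, sort_keys=True),
    ]
    # shlex.join quotes each argument, so the JSON survives shell parsing.
    return shlex.join(args)


command = build_worker_command(
    "worker.py",  # assumed script name
    "sqlite:////mnt/data/automl_shared_db/automl.db",
    {"job_id": "abc", "time_limit": 60},  # assumed config keys
)
```

Quoting with `shlex.join` matters here because the JSON payload contains spaces, braces, and double quotes that a shell would otherwise split or interpret.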
DOMINO_ENVIRONMENT_ID and DOMINO_ENVIRONMENT_REVISION_ID are set on the App container and identify the compute environment with the right dependencies. Using env vars eliminates per-caller plumbing and ensures child jobs always match the App's environment. Removes environment_id param from _job_start, start_training_job, and start_eda_job. Adds environmentRevisionId to job payload.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
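The pattern described in this commit (read the environment once at construction, then stamp it onto every job payload) might be sketched as follows. The class name `EnvironmentDefaults` and the `environmentId` payload key are assumptions for illustration; the env var names, `environmentRevisionId`, and `runCommand` come from this PR.

```python
import os
from typing import Dict, Mapping, Optional


class EnvironmentDefaults:
    """Hypothetical holder for the App's compute environment identifiers."""

    def __init__(self, environ: Optional[Mapping[str, str]] = None) -> None:
        # Read the env vars once, at construction, instead of per call.
        env = os.environ if environ is None else environ
        self.environment_id = env.get("DOMINO_ENVIRONMENT_ID") or None
        self.environment_revision_id = env.get("DOMINO_ENVIRONMENT_REVISION_ID") or None

    def apply(self, payload: Dict[str, object]) -> Dict[str, object]:
        """Return a copy of the payload with environment fields added when set."""
        merged = dict(payload)
        if self.environment_id:
            merged["environmentId"] = self.environment_id  # assumed key name
        if self.environment_revision_id:
            merged["environmentRevisionId"] = self.environment_revision_id
        return merged


defaults = EnvironmentDefaults(
    {"DOMINO_ENVIRONMENT_ID": "env-1", "DOMINO_ENVIRONMENT_REVISION_ID": "rev-2"}
)
payload = defaults.apply({"runCommand": "python worker.py"})
```

Accepting an optional mapping in the constructor keeps the sketch testable without mutating the process environment.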
Resolve the training data path at job creation time and pass it as --file-path to the Domino Job command. The worker uses the path directly instead of needing dataset API access at runtime.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
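On the worker side, consuming a pre-resolved path could look roughly like the sketch below. The function name `parse_worker_args` and the example values are hypothetical; the flag names `--file-path`, `--database-url`, and `--job-config` are the ones mentioned in this PR.

```python
import argparse
import json
from typing import List, Optional


def parse_worker_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the CLI args a worker receives from the launcher (hypothetical)."""
    parser = argparse.ArgumentParser(description="Hypothetical worker entry point")
    # The launcher resolves the path up front, so the worker just reads it.
    parser.add_argument("--file-path", required=True)
    parser.add_argument("--database-url", default=None)
    # argparse applies the `type` callable to the raw string, decoding the JSON.
    parser.add_argument("--job-config", type=json.loads, default={})
    return parser.parse_args(argv)


args = parse_worker_args(
    ["--file-path", "/mnt/data/train.csv", "--job-config", '{"time_limit": 60}']
)
```

With this shape the worker needs no dataset API credentials at runtime; everything it needs arrives on the command line.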
1cc4fba to 1bdf84a
Query params are the canonical approach going forward. The X-Project-Id header is kept as a fallback for legacy clients only.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The frontend sends both the header and the query param from the same source, so there is no scenario where the header is present but the query param isn't. Query param only: simpler.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
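The resolution rule these commits converge on (query param only, with no header and no DOMINO_PROJECT_ID fallback) can be sketched as below. The function name `resolve_project_id_from_query` and the parameter name `projectId` are assumptions; the real resolve_request_project_id() signature is not shown in this PR.

```python
from typing import Mapping, Optional


def resolve_project_id_from_query(
    query_params: Mapping[str, str],
    param_name: str = "projectId",  # assumed query param name
) -> Optional[str]:
    """Return the target project ID from query params, or None if absent.

    Deliberately ignores request headers and the DOMINO_PROJECT_ID env var:
    the env var names the App's own project, not the one the user targets.
    """
    value = query_params.get(param_name, "").strip()
    return value or None
```

Returning `None` instead of falling back forces callers to handle the missing-project case explicitly rather than silently operating on the wrong project.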
[DOM-75514] feat: core HTTP layer enhancements + debug middleware
[DOM-75515] feat: project type detection, DB URL remap, tabular data helpers
…tils [DOM-75516] feat: leaderboard normalization utils + request project ID resolver
    id_column=request.id_column,
    rolling_window=request.rolling_window,
    hardware_tier_name=request.domino_hardware_tier_name or settings.domino_eda_hardware_tier_name,
    environment_id=request.domino_environment_id or settings.domino_eda_environment_id,
why remove this? Will the environment variables for specifying the environment to use still work?
The environment_id parameter was moved to DominoJobLauncher.__init__ (domino_job_launcher.py:36-37), which reads DOMINO_ENVIRONMENT_ID from the env vars once at construction, so it no longer needs to be passed per call from the profiling route. The env var still works; it's just read in one place now instead of being threaded through every call site.
    The App's DB lives at e.g. ``/mnt/data/automl_shared_db/automl.db``
    (local mount). A Job running in a *different* project sees the
    App's data under ``/mnt/imported/data/`` instead. Swap the prefix
We are eliminating the need for imported data, so we should not include this functionality here.
Removed. The _remap_db_url_for_target method and all /mnt/imported/data/ remapping logic are gone. Jobs now pass self.settings.database_url directly.
    async def _job_api_request(self, method: str, path: str, **kwargs) -> httpx.Response:
        """Call a Jobs API endpoint, preferring the direct host over the proxy."""
        last_exc: Optional[Exception] = None
        base_urls = self._job_api_base_urls()
I don't think it's necessary to figure out multiple base URLs. Just send to the DOMINO_API_HOST.
Simplified. Removed _job_api_base_urls() and the _job_api_request() fallback loop. All calls now go through DOMINO_API_HOST via the generated client.
    request_kwargs.setdefault("max_retries", 0)
    is_last = index == len(base_urls) - 1
    try:
        return await domino_request(
Use the generated public API client for this.
Done. _job_start now uses start_job.asyncio_detailed() with NewJobV1 from the generated public API client. get_job_status uses get_job_details.asyncio_detailed(). Only stop_job still uses raw domino_request since there's no public API for v4 stop.
…ed client
Per Niole's review:
- Remove _remap_db_url_for_target (eliminating imported data pattern)
- Remove multi-base-URL fallback, use DOMINO_API_HOST directly
- Replace raw domino_request() calls with generated public API client for job start (start_job) and status (get_job_details)
- Keep domino_request only for v4 stop (no public API alternative)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Why
The job launcher currently uses Domino's v4 Jobs API (/v4/jobs/start, /v4/jobs/{id}), which is internal and undocumented. This creates risk: internal APIs can change without notice.
The public v1 API (/api/jobs/v1/jobs) is documented, stable, and has two advantages:
- … runCommand instead of commandToRun)
Additionally, cross-project training jobs need the database URL remapped because the SQLite path written by the App (/domino/datasets/local/automl-extension/automl.db) doesn't exist in the target project's mount layout. The launcher now passes --database-url and --job-config as CLI args to workers.
Depends on
- … (domino_request enhancements)
Summary
- POST /v4/jobs/start → POST /api/jobs/v1/jobs
- GET /v4/jobs/{id} → GET /api/jobs/beta/jobs/{id}
- Remove _resolve_hardware_tier_id() (no longer needed)
- Add _job_api_request() with nucleus-first, proxy-fallback routing
- Add _remap_db_url_for_target() for cross-project SQLite paths
- Pass database_url and job_config CLI args to training/EDA workers
Files changed
- app/core/domino_job_launcher.py: rewritten job launch and status APIs
- tests/test_domino_job_launcher.py: new tests
Test plan
- test_domino_job_launcher.py passes