[BDP-102028] feat(optimizer): [2/N] Optimizer REST Service and Controller by mkuchenbecker · Pull Request #531 · linkedin/openhouse

mkuchenbecker · 2026-04-06T18:02:12Z

Optimizer Stack

PR	Content
#527	Data Model
#530	Database Repos
#531 (this)	REST service
#533	Analyzer app
#534	Scheduler app
#tbd	Spark BatchedOFD app
#tbd	Infra, docker-compose, smoke test

Summary

PR 2 of N in the optimizer stack.
Overall Project
Service Design doc.

Service layer and REST controllers for the optimizer service, plus the apps/optimizer shared module providing lightweight entity/repo copies for the analyzer and scheduler apps.

Changes

Service layer: OptimizerDataService interface and OptimizerDataServiceImpl — CRUD operations, complete-operation lifecycle, stats upsert with history double-write, filtered queries.

Controllers: TableOperationsController, TableOperationsHistoryController, TableStatsController — REST endpoints per the design doc API spec.

Shared module (apps/optimizer): Lightweight entity and repository copies used by the analyzer and scheduler apps to read optimizer state directly from MySQL.

Testing Done

Manually Tested on local docker setup. Please include commands ran, and their output.
Added new tests for the changes made.
Updated existing tests to reflect the changes made.
No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
Some other form of testing like staging or soak time in production. Please explain.

H2 integration tests in OptimizerDataServiceImplTest (5 tests):

completeOperation_writesHistoryFromOperationRow — saves SCHEDULED row, completes it, asserts history DTO fields
completeOperation_notFound_returnsEmpty — completes nonexistent ID, asserts empty
upsertTableStats_createsNewRow — upserts new table, asserts DTO and repo row
upsertTableStats_updatesExistingRow — upserts twice, asserts overwrite with single row
upsertTableStats_appendsHistoryOnEveryCall — upserts twice, asserts 2 history rows

./gradlew :services:optimizer:test
# BUILD SUCCESSFUL — all 25 tests pass (repo tests from PR 1 + 5 new service tests)

Additional Information

Breaking Changes
Deprecations
Large PR broken into smaller PRs, and PR plan linked in the description.

Service interface and implementation for all optimizer CRUD operations including complete-operation lifecycle, stats upsert with history double-write, and filtered queries. Three REST controllers expose the endpoints. The apps/optimizer shared module provides lightweight entity/repo copies for the analyzer and scheduler apps. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Align OptimizerDataServiceImpl with renamed repository methods from optimizer-1 review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Resolve repo conflicts by taking optimizer-1's clean find-only versions. Scheduler-specific methods and streamAll removed per review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mkuchenbecker

this needs tests

Propagate CompleteOperationRequest orphan field removal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

H2 integration tests for OptimizerDataServiceImpl covering completeOperation (write history, not-found) and upsertTableStats (create, update, history append). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Strengthen upsertTableStats test to verify history rows contain the raw delta stats from each call, not just the row count. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

abhisheknath2011 · 2026-04-13T14:53:34Z

+   * with the history row, or 404 if the operation does not exist.
+   */
+  @PostMapping("/{id}/complete")
+  public ResponseEntity<TableOperationsHistoryDto> completeOperation(


We need table name and database name as input. We can keep the url format same as how tables sevice urls are specified like v1/databases/DB/tables/TABLE

Or can be passed as parameters.

These APIs are intentionally keyed by table UUID because of drop-and-recreate semantics: a recreated table is a brand-new entity for the optimizer (new stats, new storage, new operation history), and a name-based key would conflate two distinct identities. The Spark caller of /{id}/complete already has the operation id. We'll add a name-based variant when a concrete use case lands; today the only such use case is operation-history browsing, which is covered separately.

abhisheknath2011 · 2026-04-13T14:54:49Z

+
+  /** Fetch a single operation row by its ID, regardless of status. Returns 404 if not found. */
+  @GetMapping("/{id}")
+  public ResponseEntity<TableOperationsDto> getTableOperation(@PathVariable String id) {


Same comment database name and table name needed.

Same answer — fetch-by-id is intentional for the same drop-and-recreate reason. The list endpoint at the controller root already accepts databaseName / tableName as optional query-param filters when a multi-criteria browse is needed.

abhisheknath2011 · 2026-04-13T15:03:45Z

+
+/** REST controller for {@code table_operations}. */
+@RestController
+@RequestMapping("/v1/table-operations")


Can we have common format for all urls like common prefix /v1/optimizer/ and operations can be be suffix. So the url can be something like /v1/optimizer/operations.

Claude: Renamed to /v1/optimizer/operations. Applied the same /v1/optimizer/... namespacing across all three controllers.

abhisheknath2011 · 2026-05-13T17:09:21Z

+@RestController
+@RequestMapping("/v1/optimizer/databases/{databaseName}/tables/{tableName}")
+@RequiredArgsConstructor
+public class TableByNameController {


I think we can merge this controller with TableOperationsController or TableOperationsHistoryController. Basically in these controllers additional endpoint can be added.

abhisheknath2011 · 2026-05-13T17:24:44Z

+   */
+  @PutMapping("/{tableUuid}")
+  public ResponseEntity<TableStatsDto> upsertTableStats(
+      @PathVariable String tableUuid, @RequestBody UpsertTableStatsRequest request) {


Is the request body UpsertTableStatsRequest not part of this PR?

Its in the first PR.

#527

@RequestParam

…e layer OptimizerDataService's filter-style methods (listTableOperations, listTableStats, getStatsHistory, listHistory) accepted nullable strings/ enums to mean "no filter". Switch to Optional<T> at the service boundary; controllers wrap their nullable @RequestParam values via Optional.ofNullable. The implementation unwraps via .orElse(null) at the JPA repo call site — the @query "IS NULL OR ..." pattern is idiomatic with nullable parameters and stays unchanged. No behavior change. No tests required updating. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…rename wire HistoryStatus services/optimizer no longer maintains its own JPA entities or Spring Data repositories for the four optimizer DB tables. Apps/optimizer is the single source of truth; services/optimizer depends on apps/optimizer (renamed via project(':apps:optimizer').name = 'optimizer-data' to dodge the project.name collision that previously caused a self-referential dependency error). Removed from services/optimizer: - entity/{TableOperationsRow, TableOperationsHistoryRow, TableStatsRow, TableStatsHistoryRow}.java - repository/{TableOperationsRepository, TableOperationsHistoryRepository, TableStatsRepository, TableStatsHistoryRepository}.java - api/model/TableStats.java (duplicate) - api/model/OperationHistoryStatus.java (renamed → HistoryStatus to match the internal enum naming) - config/JobResultConverter.java (no longer needed — entity stores result as raw String JSON and the mapper converts at the wire boundary) Added on apps-side: - TableStatsHistoryRow + TableStatsHistoryRepository (previously only on services-side) - jobId + result fields on TableOperationsHistoryRow so it covers all services-side use cases - find(...) on TableOperationsHistoryRepository extended to the 8-filter service-layer shape (databaseName, tableName, tableUuid, operationType, status, since, until, pageable) - toBuilder = true on TableStatsRow so OptimizerDataServiceImpl.upsertTableStats can use the existing.toBuilder() pattern Mapper updates: - OptimizerMapper gains String ↔ wire-enum helpers and a JSON ↔ JobResult pair (replaces the old JPA AttributeConverter approach). - OptimizerDataServiceImpl unwraps Optional<Enum> filters via .name() before calling the now-shared apps-side repos. Tests updated to match: entity-builder calls pass enum.name() Strings; repo.find(...) args reordered to apps-side (operationType, status, tableUuid, databaseName, tableName); JobResult.builder() in test fixtures replaced with literal JSON strings to match the String-typed result column. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Follow-up to the optimizer-1 module move: - services/optimizer/build.gradle: drop the now-dead `implementation project(':apps:optimizer-data')` (target module removed in the prior merge). - Restore services-side TableStatsHistoryRepository (lost in the merge because optimizer-1 did not touch its services-side copy, but optimizer-2's HEAD had removed it during the R7-5 consolidation). - Drop the multi-filter `listHistory` service method, its controller endpoint, and the standalone TableByNameController. Callers use `getHistory(tableUuid, limit)` which now uses the simplified `findByTableUuidOrderByCompletedAtDesc` derived query. - TableStatsRow: enable `@Builder(toBuilder = true)` so `upsertTableStats` can build from the existing row.

Adapt the REST service layer to the new architecture introduced on optimizer-0 and optimizer-1: - api/ and db/ data types are self-contained per layer. - model/mapper/ApiModelMapper and model/mapper/ModelDbMapper own all cross-layer translation. - The old api/mapper/OptimizerMapper is gone. - JobResult is removed from the wire entirely. - TableStats is split on the DB side: TableStatsRow holds only the snapshot; TableStatsHistoryRow holds snapshot + delta per commit. Changes: OptimizerDataServiceImpl rewrite: - Inject ApiModelMapper + ModelDbMapper instead of OptimizerMapper. - Operations: list/get/complete/append go db row → ModelDbMapper → model object → ApiModelMapper → wire DTO. Enum filters on list() translate api → model → db. - completeOperation: signature is now (CompleteOperationRequest) only; operationId lives in the body. No jobId / result on the written history row. - Stats: split api.TableStats into snapshot (current-state row) and snapshot+delta (history row) at write time. Join back to the wire TableStats at read time (current-state has snapshot only; history has both). OptimizerDataService interface: - completeOperation(CompleteOperationRequest) — drop the String id path-style parameter. TableOperationsController: - POST endpoint moves from /{id}/complete to /complete. operationId is read from the request body. application.properties: - Re-introduced with production runtime config (server.port, application name, actuator) and JPA/MySQL datasource + schema-init pointing at the schema added on optimizer-1. OptimizerDataServiceImplTest: rewritten to use api/ + db/ types, new completeOperation signature, and the split snapshot/delta on stats; drop JobResult-dependent assertions.

Push the api/model boundary out of the service entirely. After this commit, calling into OptimizerDataService never returns or accepts a wire DTO; controllers (or any future CLI / in-process consumer) own the marshalling at their own edge. Service interface: - All return types and parameters are model/ types or primitives. - completeOperation(String operationId, model.HistoryStatus status). - upsertTableStats(model.Table table) — caller supplies a Table; the service stamps Table.updatedAt and returns the updated Table. - listTableOperations / getStatsHistory / etc. return Lists of model types. Service impl: - Drop ApiModelMapper injection. Only depends on ModelDbMapper. - All conversions are db row → ModelDbMapper → model. The new toStatsHistory mapper method (landed on optimizer-1) handles the history-row case. The updated toTable now stamps Table.updatedAt from the row. Controllers (api/controller/*): - TableOperationsController, TableOperationsHistoryController, TableStatsController now inject ApiModelMapper and do api↔model conversion at the boundary. Each controller method takes api request types, converts to model, calls the service, converts the returned model back to api DTOs. - TableOperationsController.complete continues to take the operationId from the request body. Test: - OptimizerDataServiceImplTest now exercises the service in model types: builders create model.Table, assertions read model.HistoryStatus / model.OperationType / model.TableStats etc. Verification: `git grep "import com.linkedin.openhouse.optimizer.api" -- services/optimizer/src/main/java/com/linkedin/openhouse/optimizer/service/` returns empty.

The api↔model and model↔db boundaries no longer go through injected mapper beans. Switch every call site to the to/from methods that now live on the types themselves. OptimizerDataServiceImpl: - Drop the ModelDbMapper field. No DI at all (only repositories). - Row → model via TableOperation::fromRow, TableOperationsHistory::fromRow, Table::fromRow, TableStatsHistory::fromRow. - Model → row via instance methods: history.toRow(), table.toBuilder() ...build().toRow(), and TableStats stats.toSnapshotRow() / .toDeltaRow(). - Enum filters on list() use OperationType::toDb / OperationStatus::toDb method references. Controllers (TableOperationsController, TableOperationsHistoryController, TableStatsController): - Drop the ApiModelMapper field. - api → model on the way in: dto.toModel(), request.toModel(uuid), request.getStatus().toModel(), apiEnum.toModel(). - model → api on the way out: Dto.fromModel(modelObj).

…t Table OptimizerDataService.upsertTableStats / getTableStats / listTableStats now operate on model.TableStats. The service stays decoupled from Table — stats are stats, not tables. Conversion to TableStatsRow goes through TableStats.toRow / fromRow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

This was referenced Apr 6, 2026

[BDP-102028] feat(optimizer): [1/N] Optimizer Database #530

Open

[BDP-102028] feat(optimizer): [0/N] Optimizer API and internal model #527

Open

mkuchenbecker marked this pull request as draft April 6, 2026 18:03