Skip to content

feat(controller) Persist batch and batch dependent entities in store#119

Closed
manjari25 wants to merge 3 commits into
mainfrom
manjari/batch-controller
Closed

feat(controller) Persist batch and batch dependent entities in store#119
manjari25 wants to merge 3 commits into
mainfrom
manjari/batch-controller

Conversation

@manjari25
Copy link
Copy Markdown
Contributor

@manjari25 manjari25 commented Mar 4, 2026

Summary

feat(controller) Persist batch and batch dependent entities in store

What?

  • Update the batch store with newly created batch (request maps 1:1 to batch)
  • Use newly added core/metrics pkg to emit metrics
  • Add a new batch state ("ready") to indicate success of all store operations in this controller

Why?

Enables sequential batch ordering per queue so that downstream stages (score, speculate, merge) can process batches in the correct dependency order.

Test Plan

Issues

@manjari25 manjari25 marked this pull request as ready for review March 4, 2026 23:49
@manjari25 manjari25 requested review from a team, behinddwalls and sbalabanov as code owners March 4, 2026 23:49
)
c.metricsScope.Counter("counter_errors").Inc(1)
return fmt.Errorf("failed to generate batch ID for queue=%s: %w", request.Queue, err)
return errs.NewRetryableError(fmt.Errorf("failed to generate batch ID for queue=%s: %w", request.Queue, err))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz do not default to retryables

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other review for details

)

// Generate a globally unique batch ID.
counterOp := metrics.Begin(c.metricsScope, "counter_next")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

counter has its one metrics, so this is probably redundant?


// Persist batch to storage.
// ErrAlreadyExists should never happen since batch IDs are generated from a unique counter.
batchCreateOp := metrics.Begin(c.metricsScope, "batch_store_create")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

batchCreateOp := metrics.Begin(c.metricsScope, "batch_store_create")
if err := c.store.GetBatchStore().Create(ctx, batch); err != nil {
batchCreateOp.Complete(err)
c.logger.Errorw("failed to create batch in storage",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as long as we surface error up, avoid logging as it would most likely result in double logging.
Logging should be done at where the error chain terminates, typically the highest function in the call stack

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all QueueControllers, i guess we just log at the consumer.go and leave it as is. For API controllers, we just log at the handlers which would be implemented in go-code?

So i think we should probably never log anywhere unless we are swallowing some errors (ideally we never need to)

"batch_id", batch.ID,
"error", err,
)
if errors.Is(err, storage.ErrAlreadyExists) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not expect it to ever happen if counter is correct, do not need special processing to this error, just fire it up

// ErrAlreadyExists should never happen since batch IDs are generated from a unique counter.
batchCreateOp := metrics.Begin(c.metricsScope, "batch_store_create")
if err := c.store.GetBatchStore().Create(ctx, batch); err != nil {
batchCreateOp.Complete(err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adhoc approach to scope management is fragile
what if some code path below won't return (could be another code change that makes it such). Complete would be called twice.
Prefer defer or another function wrapper to scope the measured block into.

@behinddwalls behinddwalls self-requested a review March 6, 2026 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants