Skip to content

refactor(change): persist typed change details from the change provider#195

Open
behinddwalls wants to merge 1 commit into
change-storefrom
change-details
Open

refactor(change): persist typed change details from the change provider#195
behinddwalls wants to merge 1 commit into
change-storefrom
change-details

Conversation

@behinddwalls
Copy link
Copy Markdown
Collaborator

@behinddwalls behinddwalls commented Jun 4, 2026

Summary

Why?

The change provider already produces rich per-URI facts (author, changed
files, line counts), but its value types lived in the extension layer and
the data was thrown away — validate fetched ChangeInfo only to log a file
count, and ChangeRecord stored an opaque Metadata JSON string that was never
written. Nothing downstream could read typed change facts.

What?

  • Move the change value types into entities: entity.User, entity.ChangedFile
    (now with LinesModified), entity.ChangeDetails (the facts), and
    entity.ChangeInfo (URI -> Details), with aggregation helpers. The
    changeprovider extension and GitHub impl now produce these.
  • Replace ChangeRecord.Metadata (opaque string) with typed Details
    (ChangeDetails); the change table's metadata JSON column becomes details.
  • Add ChangeStore.UpdateDetails — a version-guarded conditional write,
    following the optimistic-locking contract (arithmetic in the controller).
  • validate now persists each fetched ChangeInfo onto the request's change
    records (per-URI, idempotent; ErrVersionMismatch is a benign no-op).

This is the producer half: typed details now exist and are persisted. The
score controller consumes them in a follow-up.

Test Plan

  • make build, make test, make lint, make check-mocks/gazelle/tidy
  • make integration-test (storage contract suite round-trips Details and
    covers UpdateDetails create/update/version-mismatch)

Issues

Stack

  1. refactor(storage): fold ChangeStore into storage, drop dead store #191
  2. @ refactor(change): persist typed change details from the change provider #195
  3. refactor(scorer): score batches over typed change details #196
  4. feat(extensions): fake implementations with error injection #197

package entity

// User represents the author of a change.
type User struct {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User is very generic name. May be "ChangeAuthor" or "Author" ?

type ChangedFile struct {
// Path is the file path relative to the repository root.
Path string `json:"path"`
// Patch is the diff patch content for this file.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we always store it? If not, may be give expectations.

return fmt.Errorf("failed to marshal details for change record queue=%s uri=%s request_id=%s: %w", record.Queue, record.URI, record.RequestID, err)
}

const query = "UPDATE `change` SET details = ?, updated_at = ?, version = ? WHERE queue = ? AND uri = ? AND request_id = ? AND version = ?"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If details contain a patch, wouldn't it be expensive to rewrite it every time?

// record for this request. For each info it reads the record at (queue, info.URI,
// request.ID) — the row start.claimURIs created — and conditionally writes the
// details, bumping the version. A missing record or a lost version race surfaces
// as ErrVersionMismatch, treated as a benign no-op so redelivery stays idempotent.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we make it immutable and fixed at Change info creation, or have separate entities and datastores for immutable/mutable fields?

## Summary

### Why?

The change provider already produces rich per-URI facts (author, changed
files, line counts), but its value types lived in the extension layer and
the data was thrown away — validate fetched ChangeInfo only to log a file
count, and ChangeRecord stored an opaque Metadata JSON string that was never
written. Nothing downstream could read typed change facts.

### What?

- Move the change value types into entities: entity.User, entity.ChangedFile
  (now with LinesModified), entity.ChangeDetails (the facts), and
  entity.ChangeInfo (URI -> Details), with aggregation helpers. The
  changeprovider extension and GitHub impl now produce these.
- Replace ChangeRecord.Metadata (opaque string) with typed Details
  (ChangeDetails); the change table's metadata JSON column becomes details.
- Add ChangeStore.UpdateDetails — a version-guarded conditional write,
  following the optimistic-locking contract (arithmetic in the controller).
- validate now persists each fetched ChangeInfo onto the request's change
  records (per-URI, idempotent; ErrVersionMismatch is a benign no-op).

This is the producer half: typed details now exist and are persisted. The
score controller consumes them in a follow-up.

## Test Plan

- ✅ `make build`, `make test`, `make lint`, `make check-mocks/gazelle/tidy`
- ✅ `make integration-test` (storage contract suite round-trips Details and
  covers UpdateDetails create/update/version-mismatch)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants