Skip to content

CBG-5379 add resync stats for targeted/error#8294

Open
torcolvin wants to merge 7 commits into
mainfrom
CBG-5379
Open

CBG-5379 add resync stats for targeted/error#8294
torcolvin wants to merge 7 commits into
mainfrom
CBG-5379

Conversation

@torcolvin
Copy link
Copy Markdown
Collaborator

@torcolvin torcolvin commented May 22, 2026

CBG-5379 add resync stats for targeted/error

Pre-review checklist

  • Removed debug logging (fmt.Print, log.Print, ...)
  • Logging sensitive data? Make sure it's tagged (e.g. base.UD(docID), base.MD(dbName))
  • Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in docs/api

Integration Tests

@torcolvin torcolvin self-assigned this May 22, 2026
Copilot AI review requested due to automatic review settings May 22, 2026 17:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds additional DCP resync status metrics (targeted document estimate + errored document count) and introduces a collection-level document count query to support those stats.

Changes:

  • Add a CountAllDocs helper (view + N1QL) and a new N1QL COUNT(*) query type/stat (QueryTypeCountDocs).
  • Track and expose new resync stats (docs_targeted, docs_error_count) in DCP resync status, including persistence/resume support.
  • Extend resync tests to validate DocsTargeted behavior; add unit test for CountAllDocs.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
db/query.go Adds QueryCountDocs and CountAllDocs to count documents via view/N1QL.
db/query_test.go Adds unit test coverage for CountAllDocs across delete/purge cases.
db/database.go Registers the new query type for query stats initialization.
db/database_collection.go Introduces DatabaseCollections helper type for resync collection handling.
db/background_mgr_resync_dcp.go Adds targeted/errored resync stats and collection selection refactor.
db/background_mgr_resync_dcp_test.go Extends resync init/start/resume tests to assert DocsTargeted.
Comments suppressed due to low confidence (2)

db/background_mgr_resync_dcp.go:521

  • This change adds new fields to the resync status JSON (docs_error_count and docs_targeted) returned by GET /{db}/_resync. The OpenAPI schema Resync-status in docs/api/components/schemas.yaml currently documents only docs_changed and docs_processed, so the API specs should be updated to include these new properties (and required-ness if applicable).
	DocsErrored   int64  `json:"docs_error_count"`
	DocsTargeted  uint64 `json:"docs_targeted"`
}

// SetProcessStatus reports the new status that was serialized to the bucket along with the last status that polled
// using GetProcessStatus.
func (r *ResyncManagerDCP) SetProcessStatus(ctx context.Context, previousStatus []byte, newStatus []byte) {

db/background_mgr_resync_dcp.go:254

  • New docs_error_count/DocsErrored behavior is introduced (local counter increment + surfaced in status), but there doesn’t appear to be any unit/integration test asserting it increments on a resync document failure and is reported via GET /{db}/_resync. Adding a focused test that forces a resync update error and checks DocsErrored would help prevent regressions.
		} else if err != base.ErrUpdateCancel {
			r.docsErroredLocal.Add(1)
			base.WarnfCtx(ctx, "Resync: Error updating doc %q: %v", base.UD(docID), err)
			return false
		}
		return true
	}

	if r.hasAllCollections {

Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/database_collection.go
Comment on lines +442 to +460
// DatabaseCollections is a set of collections
type DatabaseCollections []*DatabaseCollection

// getIDs returns the collection IDs for each collection
func (c *DatabaseCollections) getIDs() []uint32 {
ids := make([]uint32, 0, len(*c))
for _, collection := range *c {
ids = append(ids, collection.GetCollectionID())
}
return ids
}

// getNames return the names of the collections
func (c *DatabaseCollections) getNames() base.CollectionNames {
names := base.NewCollectionNames()
for _, collection := range *c {
names.Add(collection.dataStore)
}
return names
Comment thread db/query.go Outdated
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Redocly previews

Copy link
Copy Markdown
Collaborator

@adamcfraser adamcfraser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General approach looks good, some questions on the details.

Comment thread db/query.go

// View Query
if c.useViews() {
opts := Body{"stale": false, "reduce": true}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're not relying on this for anything other than an estimate, can we use "stale":ok for improved view performance?

Comment thread db/query.go
countDocsQueryStatement := replaceSyncTokensQuery(QueryCountDocs.statement, c.UseXattrs())
countDocsQueryStatement = replaceIndexTokensQuery(countDocsQueryStatement, sgIndexes[IndexAllDocs], c.UseXattrs(), c.numIndexPartitions())

results, err := N1QLQueryWithStats(ctx, c.dataStore, QueryTypeCountDocs, countDocsQueryStatement, nil, base.RequestPlus, QueryCountDocs.adhoc, c.dbStats(), c.slowQueryWarningThreshold())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the above - should we use base.NotBounded instead of base.RequestPlus since we're not requiring that level of consistency for the results?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I use NotBounded here, then this value is wrong when running the integration tests. I'm inclined to change this back to RequestPlus because it largely doesn't matter and then we'd know if the query goes wrong.

This seems to be a problem in a few different ways.

  • Add 5 docs, sometimes not all 5 show up
  • Other tests from the bucket pool actually have old documents/tombstones that are found. These are purged but still show up in the index.

Comment thread db/query_test.go
Comment thread docs/api/components/schemas.yaml
Comment thread db/query.go Outdated
var row struct {
Count uint64 `json:"count"`
}
if results.Next(ctx, &row) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a good use case for results.One?

Comment thread db/database_collection.go
func (c *DatabaseCollections) getNames() base.CollectionNames {
names := base.NewCollectionNames()
for _, collection := range *c {
names.Add(collection.dataStore)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit unexpected to have to go to the dataStore to get the scope/collection names when they are already copied to DatabaseCollection, but possibly it's more reliable in case someone doesn't use the constructor to build DatabaseCollection?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just shorter for me as a programmer to use this than collection.Scope and collection.Name because Add takes a DataStore object. No particular reason.

Comment thread db/background_mgr_resync_dcp.go Outdated
Comment thread db/background_mgr_resync_dcp.go Outdated

docsTargeted, err := totalResyncDocs(ctx, collections)
if err != nil {
return err
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering whether we should actually error out here, or proceed without a docsTargeted stat. Seems like it would be preferable for an unexpected error performing the count to not block resync more generally.

@torcolvin torcolvin assigned adamcfraser and unassigned torcolvin May 26, 2026
adamcfraser
adamcfraser previously approved these changes May 26, 2026
Copy link
Copy Markdown
Collaborator

@adamcfraser adamcfraser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants