[cli] Support cursor pagination for fetching requestIdentifiers #12

Draft
bencmbrook wants to merge 1 commit into bencmbrook/sdk-metadata-cursor-pagination from bencmbrook/migrate-cli-pr-533

Conversation

@bencmbrook bencmbrook commented Mar 19, 2026

Member

Port of transcend-io/cli#533

Stacked on #198 (GraphQL metadata cursor pagination). Review/merge that first; this PR's diff is scoped to the Sombra request-identifier fetching.

Summary

  • Switch Sombra request identifier fetching from offset/per-request calls to cursor-based pagination plus parallel batched lookups, threaded through request export, restart, and manual enrichment.
  • Fetch identifiers in parallel groups (split requestIds into concurrency groups per page) instead of one sequential cursor walk.
  • Bound total Sombra connections via a shared budget (MAX_IDENTIFIER_FETCH_CONCURRENCY, default 200) divided across --concurrency date-range chunks, so the no-flags default stays competitive and large parallel exports don't overload Sombra. Tunable via the new request export --maxIdentifierConcurrency flag.
  • Deprecate the now-unused per-request fetchAllRequestIdentifiers in favor of fetchRequestIdentifiersBatch.
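
The cursor walk and parallel grouping described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `FetchPage`, `splitIntoGroups`, `fetchGroup`, `fetchIdentifiersInGroups`, and the page shape (`identifiers`, `nextCursor`) are all hypothetical names, and the real Sombra response format may differ.

```typescript
// Hypothetical shapes for illustration; the real Sombra response and SDK types may differ.
interface Identifier {
  requestId: string;
  name: string;
  value: string;
}

interface IdentifierPage {
  identifiers: Identifier[];
  nextCursor: string | null; // null once the last page has been returned
}

// Assumed transport: fetches one page of identifiers for a set of request IDs.
type FetchPage = (requestIds: string[], cursor: string | null) => Promise<IdentifierPage>;

/** Split items into at most `groups` contiguous slices of roughly equal size. */
function splitIntoGroups<T>(items: T[], groups: number): T[][] {
  const count = Math.max(1, Math.min(Math.floor(groups), items.length));
  const size = Math.ceil(items.length / count);
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

/** Walk one group's cursor until the server reports no further page. */
async function fetchGroup(fetchPage: FetchPage, requestIds: string[]): Promise<Identifier[]> {
  const all: Identifier[] = [];
  let cursor: string | null = null;
  do {
    const page: IdentifierPage = await fetchPage(requestIds, cursor);
    all.push(...page.identifiers);
    cursor = page.nextCursor;
  } while (cursor !== null);
  return all;
}

/** Fetch identifiers for all request IDs, one independent cursor walk per group, in parallel. */
async function fetchIdentifiersInGroups(
  fetchPage: FetchPage,
  requestIds: string[],
  concurrency: number,
): Promise<Map<string, Identifier[]>> {
  const groups = splitIntoGroups(requestIds, concurrency);
  const results = await Promise.all(groups.map((group) => fetchGroup(fetchPage, group)));

  // Requests without identifiers still get an (empty) entry.
  const byRequest = new Map<string, Identifier[]>();
  for (const id of requestIds) {
    byRequest.set(id, []);
  }
  for (const groupResult of results) {
    for (const identifier of groupResult) {
      const bucket = byRequest.get(identifier.requestId);
      if (bucket) bucket.push(identifier);
    }
  }
  return byRequest;
}
```

Each group walks its own cursor, so the number of in-flight connections for one page of requests is bounded by the number of groups rather than by the number of requests.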

Context

Performance (staging, 40,055 requests, --concurrency 20)

  • This branch: ~597s, completed
  • main baseline (per-request fan-out): ~1122s, and dropped ~25% of requests at single-chunk scale (retry exhaustion from ~2k concurrent connections)
  • Output verified identical (only the time-relative daysRemaining field differs between runs)

Test plan

  • typecheck / build / test / check-exports / quality
  • unit tests for fetchRequestIdentifiersBatch (paging, retry, empty/missing ids, parallel grouping)
  • manual A/B vs main on EU prod (1k) and staging (40k)

Made with Cursor

pkg-pr-new Bot commented Mar 19, 2026


@transcend-io/airgap.js-types

pnpm add https://pkg.pr.new/@transcend-io/airgap.js-types@12
yarn add https://pkg.pr.new/@transcend-io/airgap.js-types@12.tgz

@transcend-io/cli

pnpm add https://pkg.pr.new/@transcend-io/cli@12
yarn add https://pkg.pr.new/@transcend-io/cli@12.tgz

@transcend-io/internationalization

pnpm add https://pkg.pr.new/@transcend-io/internationalization@12
yarn add https://pkg.pr.new/@transcend-io/internationalization@12.tgz

@transcend-io/privacy-types

pnpm add https://pkg.pr.new/@transcend-io/privacy-types@12
yarn add https://pkg.pr.new/@transcend-io/privacy-types@12.tgz

@transcend-io/sdk

pnpm add https://pkg.pr.new/@transcend-io/sdk@12
yarn add https://pkg.pr.new/@transcend-io/sdk@12.tgz

@transcend-io/type-utils

pnpm add https://pkg.pr.new/@transcend-io/type-utils@12
yarn add https://pkg.pr.new/@transcend-io/type-utils@12.tgz

@transcend-io/utils

pnpm add https://pkg.pr.new/@transcend-io/utils@12
yarn add https://pkg.pr.new/@transcend-io/utils@12.tgz

@transcend-io/mcp

pnpm add https://pkg.pr.new/@transcend-io/mcp@12
yarn add https://pkg.pr.new/@transcend-io/mcp@12.tgz

@transcend-io/mcp-server-admin

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-admin@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-admin@12.tgz

@transcend-io/mcp-server-assessment

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-assessment@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-assessment@12.tgz

@transcend-io/mcp-server-base

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-base@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-base@12.tgz

@transcend-io/mcp-server-consent

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-consent@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-consent@12.tgz

@transcend-io/mcp-server-discovery

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-discovery@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-discovery@12.tgz

@transcend-io/mcp-server-dsr

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-dsr@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-dsr@12.tgz

@transcend-io/mcp-server-inventory

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-inventory@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-inventory@12.tgz

@transcend-io/mcp-server-preferences

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-preferences@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-preferences@12.tgz

@transcend-io/mcp-server-workflows

pnpm add https://pkg.pr.new/@transcend-io/mcp-server-workflows@12
yarn add https://pkg.pr.new/@transcend-io/mcp-server-workflows@12.tgz

commit: cbcdb27

@bencmbrook bencmbrook changed the title from "Improve request identifier pagination in monorepo CLI" to "Support cursor pagination for fetching requestIdentifiers" Mar 19, 2026
@bencmbrook bencmbrook changed the title from "Support cursor pagination for fetching requestIdentifiers" to "[cli] Support cursor pagination for fetching requestIdentifiers" Mar 19, 2026
bencmbrook commented Mar 19, 2026

Member Author

Summary comparing the original transcend-io/cli#533 to the monorepo port tools#12:

| Area | Original cli#533 | Monorepo port tools#12 | Notes |
| --- | --- | --- | --- |
| RequestIdentifier GraphQL query | Already used cursor pagination via after + pageInfo | Preserved | This behavior was not introduced during the port |
| fetchAllRequestIdentifierMetadata() | Already cursor-paginated | Preserved | Carried over with monorepo import/path normalization only |
| fetchAllRequestIdentifiers() | Already cursor-paginated | Preserved | Carried over |
| fetchRequestIdentifiersBatch() | Present in original PR | Preserved | Carried over so downstream flows can batch Sombra lookups |
| pullPrivacyRequests() | Batch-fetched identifiers across returned requests | Preserved | Still avoids one Sombra request per request |
| streamPrivacyRequestsToCsv() | Batch-fetched identifiers per page/chunk | Preserved | Still uses batched identifier lookup during streaming export |
| pullManualEnrichmentIdentifiersToCsv() | Fetched enrichers first, then batch-fetched identifiers for qualifying requests | Preserved | Output behavior stays aligned with original PR |
| bulkRestartRequests() | Prefetched identifiers in batch when copyIdentifiers was enabled | Preserved | Same behavior in monorepo |
| splitDateRange.ts | Introduced as a shared helper in the original PR | Preserved | Added to packages/cli/src/lib/requests/splitDateRange.ts |
| requests/index.ts export | Exported splitDateRange | Preserved | Added to the monorepo requests barrel as well |

| Monorepo-only adjustment | Reason |
| --- | --- |
| Files moved from src/** to packages/cli/src/** | Fit the monorepo layout |
| Relative imports normalized to .js ESM specifiers | Match this repo’s TypeScript/NodeNext conventions |
| Added .changeset/rude-steaks-jam.md | Required for the monorepo release process |

| Category | Dropped in port? | Notes |
| --- | --- | --- |
| Functional source changes from original cli#533 | No | All 9 source-file changes from the original PR were carried over |
| Cursor pagination / batched identifier fetching behavior | No | Preserved as implemented in the original PR |
| Docs / README work | None needed | Original PR was effectively code-only, and the port remained code-only aside from the changeset |
| Repo config / workflow changes | None | No extra repo-level changes were needed for this port |

Validation on the monorepo port:

  • pnpm -F @transcend-io/cli typecheck
  • pnpm -F @transcend-io/cli build
  • pnpm -F @transcend-io/cli test
  • pnpm -F @transcend-io/cli check-exports
  • pnpm quality

Bottom line: the port preserved the original cli#533 behavior. The only intentional differences were monorepo-specific adaptations such as file relocation, ESM import normalization, and adding a changeset.

@bencmbrook bencmbrook marked this pull request as draft March 29, 2026 03:18
iamtheluckyest added a commit that referenced this pull request Jun 11, 2026
Resolve conflicts from the CLI->SDK move by adopting PR #12's cursor
pagination and batched identifier fetching, layered on top of main's
withTransientRetry wrapper, logger params, and date-range chunking.
Switch Sombra request-identifier fetching to cursor pagination with
batched, parallel lookups across the request set, bounded by a shared
connection budget that is divided across --concurrency date-range chunks.
Add a --maxIdentifierConcurrency flag to tune the budget, unit tests for
fetchRequestIdentifiersBatch, and deprecate the per-request
fetchAllRequestIdentifiers.
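
The shared connection budget in this commit message can be illustrated with a small sketch. The default of 200 comes from the PR description; the helper name `identifierConcurrencyPerChunk` and the rounding behavior (floor, with a minimum of one connection per chunk) are assumptions made for illustration and may not match the actual code.

```typescript
// Default budget as stated in the PR description.
const MAX_IDENTIFIER_FETCH_CONCURRENCY = 200;

/**
 * Hypothetical helper: divide a total Sombra connection budget evenly across
 * the date-range chunks that run in parallel (the --concurrency value).
 * Assumption: round down, but never go below one connection per chunk.
 */
function identifierConcurrencyPerChunk(
  chunkConcurrency: number,
  maxIdentifierConcurrency: number = MAX_IDENTIFIER_FETCH_CONCURRENCY,
): number {
  const chunks = Math.max(1, Math.floor(chunkConcurrency));
  return Math.max(1, Math.floor(maxIdentifierConcurrency / chunks));
}

// With --concurrency 20 and the default budget, each chunk gets 10 connections.
console.log(identifierConcurrencyPerChunk(20)); // 10
// With a single chunk (no flags), that chunk may use the whole budget.
console.log(identifierConcurrencyPerChunk(1)); // 200
```

Under this rounding rule the total can exceed the budget once the chunk count is larger than the budget itself (each chunk still gets one connection), which is one reason the minimum is labeled an assumption here.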
@iamtheluckyest iamtheluckyest force-pushed the bencmbrook/migrate-cli-pr-533 branch from daa5e7a to cbcdb27 Compare June 11, 2026 16:58
@iamtheluckyest iamtheluckyest changed the base branch from main to bencmbrook/sdk-metadata-cursor-pagination June 11, 2026 16:58
Comment on lines +103 to +105
* @deprecated Use {@link fetchRequestIdentifiersBatch} instead, which batches
* multiple requests into a single paginated call. This per-request variant is
* retained only for backwards compatibility and has no internal callers.

Member

This has no callers in this repo anymore, but my understanding is that it is exposed such that customers could be using it? So I marked it as deprecated, but idk if that's actually what we want to do.
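
For context on the trade-off, one common way to keep a public per-request function while steering callers toward a batch API is a thin deprecated wrapper that delegates to the batch call. The sketch below is purely illustrative: `BatchFetcher` and `fetchIdentifiersForRequest` are hypothetical names, and it does not reflect the actual signatures of fetchAllRequestIdentifiers or fetchRequestIdentifiersBatch.

```typescript
// Hypothetical types; the SDK's real identifier and batch signatures may differ.
interface Identifier {
  requestId: string;
  name: string;
  value: string;
}

type BatchFetcher = (requestIds: string[]) => Promise<Map<string, Identifier[]>>;

// Emit the deprecation warning only once per process.
let warnedAboutDeprecation = false;

/**
 * @deprecated Prefer calling the batch fetcher directly with many request IDs.
 * Kept so existing external callers keep working.
 */
async function fetchIdentifiersForRequest(
  batchFetcher: BatchFetcher,
  requestId: string,
): Promise<Identifier[]> {
  if (!warnedAboutDeprecation) {
    warnedAboutDeprecation = true;
    console.warn('fetchIdentifiersForRequest is deprecated; use the batch fetcher instead.');
  }
  // Delegate to the batch path with a single-element list.
  const result = await batchFetcher([requestId]);
  return result.get(requestId) ?? [];
}
```

A wrapper like this keeps a single code path to maintain, at the cost of keeping the deprecated export alive until a major version can remove it.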

3 participants