
fix(perf): batch-fetch missing entities in _measurement_requests_cursor_to_pydantic #671

Open
michael-johnston wants to merge 1 commit into main from maj_batch_fetch_missing_entities

Conversation

@michael-johnston
Member

Previously, _measurement_requests_cursor_to_pydantic called entityWithIdentifier in a loop for each entity not already in the local cache, issuing one DB round-trip (~240ms) per unique entity. For an operation touching N distinct entities this was an N+1 query pattern. The path is triggered by initializing the store and then calling measurement_requests_for_operation or related methods, which happens with `ado show X operation`.

It might be expected that `self._entities` is pre-populated before this method runs. There are two reasons it is not:

  1. The bulk-load mechanism is bypassed. The public `entities` property triggers refresh(force_fetch_all_entities=True) when `self._entities` is empty, loading all entities in one query. But _measurement_requests_cursor_to_pydantic accesses the private `self._entities` dict directly, never touching the property, so the lazy-load never fires.

  2. Nothing in the construction path populates the cache. SQLSampleStore starts with self._entities = {}. By the time _measurement_requests_cursor_to_pydantic runs, the cache is still empty and every entity lookup falls through to entityWithIdentifier.
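The bypass in point 1 can be illustrated with a minimal sketch (class and attribute names simplified; this is not the actual SQLSampleStore code):

```python
class SampleStoreSketch:
    """Illustrative model of the lazy-load property vs. direct dict access."""

    def __init__(self):
        self._entities = {}   # cache starts empty at construction
        self.bulk_loads = 0   # counts refresh(force_fetch_all_entities=True) calls

    def _refresh(self):
        # stands in for refresh(force_fetch_all_entities=True):
        # one query loading every entity into the cache
        self.bulk_loads += 1
        self._entities = {"e1": "entity-1", "e2": "entity-2"}

    @property
    def entities(self):
        # public accessor: triggers the bulk load when the cache is empty
        if not self._entities:
            self._refresh()
        return self._entities


store = SampleStoreSketch()

# Direct access to the private dict never fires the lazy load ...
assert store._entities == {} and store.bulk_loads == 0

# ... while the public property populates the cache in one query.
assert "e1" in store.entities and store.bulk_loads == 1
```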

PyMySQL already buffers the full result set in memory before the Python loop begins, so consuming the cursor up-front with fetchall() adds no network round-trip. The fix uses this to collect all missing entity IDs before any entity is resolved, then issues a single _fetch_entities query for all of them at once regardless of N, ensuring the cache is populated with the required entities.
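A minimal sketch of the fetchall-then-batch step, with a hypothetical row shape and helper names standing in for the real cursor and _fetch_entities:

```python
def resolve_rows(cursor, cache, fetch_entities):
    """Consume the cursor up-front, then batch-fetch every missing entity.

    cursor.fetchall() adds no extra round-trip here because PyMySQL has
    already buffered the full result set; fetch_entities(ids) stands in
    for the single _fetch_entities query issued for all missing IDs.
    """
    rows = cursor.fetchall()  # result set already in memory
    missing = {row["entity_id"] for row in rows} - set(cache)
    if missing:
        cache.update(fetch_entities(missing))  # one query regardless of N
    return [(row, cache[row["entity_id"]]) for row in rows]


class FakeCursor:
    """Stand-in for a buffered PyMySQL cursor."""
    def __init__(self, rows):
        self._rows = rows
    def fetchall(self):
        return self._rows


queries = []
def fetch_entities(ids):
    queries.append(set(ids))  # record one batched query
    return {i: f"entity-{i}" for i in ids}

cursor = FakeCursor([{"entity_id": "a"}, {"entity_id": "b"}, {"entity_id": "a"}])
resolved = resolve_rows(cursor, {}, fetch_entities)
assert len(queries) == 1 and queries[0] == {"a", "b"}  # 1 query, not N
```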

The entityWithIdentifier fallback is retained and still fires for entities added by a concurrent distributed process in the window between the batch fetch and the cursor scan — the intended use of that code path.
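The retained fallback can be sketched as a cache-first lookup (entity_with_identifier and cache are simplified stand-ins for entityWithIdentifier and self._entities):

```python
def resolve(identifier, cache, entity_with_identifier):
    """Cache-first lookup; falls back to a per-entity query only on a miss.

    After the batch fetch, a miss can only occur for an entity inserted by
    a concurrent process between the batch fetch and the cursor scan.
    """
    entity = cache.get(identifier)
    if entity is None:
        entity = entity_with_identifier(identifier)  # single-entity round-trip
        cache[identifier] = entity
    return entity


calls = []
def entity_with_identifier(i):
    calls.append(i)
    return f"entity-{i}"

cache = {"a": "entity-a"}            # populated by the batch fetch
assert resolve("a", cache, entity_with_identifier) == "entity-a"
assert calls == []                   # cache hit: no round-trip
assert resolve("z", cache, entity_with_identifier) == "entity-z"
assert calls == ["z"]                # concurrent-insert miss: one fallback query
```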

Measured on an operation with 4 distinct entities (197ms RTT to DB):
_measurement_requests_cursor_to_pydantic: 3,378ms → 859ms (-2,519ms)

Saving scales linearly with the number of distinct entities touched by the operation: N round-trips collapsed to 1.
