Skip to content

phase_loss_cursor, join_cursor, and temporal_cursor use row-by-row iteration for operations expressible as set-based string aggregation — procedural anti-pattern #317

@nanoDBA

Description

@nanoDBA

Problem

At lines ~5517–5530, the procedure uses a cursor (phase_loss_cursor) to emit one RAISERROR message per statistic that lost its properties data between Phase 1 and Phase 2 of staged discovery. This is a classic case where a cursor is used to iterate over a result set and emit messages sequentially — a pattern that makes sense in a procedural language, but in T-SQL can be replaced by a set-based approach that eliminates the cursor entirely.

Additionally, at lines ~6962–7033, a join_cursor iterates over distinct database names to run a per-database Query Store join-pattern analysis. And at lines ~7101–7197, a temporal_cursor iterates over distinct databases with temporal tables. Both cursors perform a pattern that is structurally equivalent: "for each database in a set, execute a dynamic SQL block." This is the canonical case where sys.sp_executesql with a batched set-based approach, or a pre-built multi-database union query, would eliminate cursor overhead.

Evidence

-- Lines 5517–5530: cursor for debug message emission
DECLARE phase_loss_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT N'  Phase 1→2 loss: ' + QUOTENAME(schema_name) + ... 
    FROM #stat_candidates WHERE modification_counter IS NULL
    ORDER BY schema_name, table_name, stat_name;
OPEN phase_loss_cursor;
FETCH NEXT FROM phase_loss_cursor INTO @phase_loss_msg;
WHILE @@FETCH_STATUS = 0
BEGIN
    RAISERROR(@phase_loss_msg, 10, 1) WITH NOWAIT;
    FETCH NEXT FROM phase_loss_cursor INTO @phase_loss_msg;
END;

The phase_loss_cursor is used solely to emit messages. A set-based alternative using FOR XML to concatenate and emit a single formatted message (or STRING_AGG on SQL 2017+) would produce equivalent debug output in one RAISERROR call with zero cursor overhead.

For join_cursor and temporal_cursor: these iterate over a set of database names to run per-database dynamic SQL. The set-based rewrite uses a WHILE EXISTS pattern (already used elsewhere in the procedure for the database iteration loop) or consolidates the dynamic SQL into a single parameterized batch that handles all databases in one sp_executesql call.

Proposed Fix

phase_loss_cursor replacement (SQL 2017+):

-- Replace cursor with STRING_AGG (SQL 2017+) or FOR XML (SQL 2016)
DECLARE @loss_summary nvarchar(max);
SELECT @loss_summary = STRING_AGG(
    N'  Phase 1→2 loss: ' + QUOTENAME(schema_name) + N'.' + QUOTENAME(table_name) + N'.' + QUOTENAME(stat_name),
    NCHAR(13) + NCHAR(10)
) FROM #stat_candidates WHERE modification_counter IS NULL;

IF @loss_summary IS NOT NULL
    RAISERROR(@loss_summary, 10, 1) WITH NOWAIT;

For join_cursor and temporal_cursor: the set-based equivalent is to accumulate the per-database dynamic SQL fragments into a single nvarchar(max) variable before executing — one EXEC call per database batch rather than one cursor iteration per database.

Note: RAISERROR with nvarchar strings has a 2,047 character limit per call. The STRING_AGG approach should chunk output at 2,047 characters if more than ~20 stats are in the loss set.

Impact

Cursors inside dynamic SQL that is itself inside a maintenance loop multiply the cursor overhead with the number of databases and the number of qualifying statistics. On a server with 20 databases and 5,000 statistics, these cursors are a measurable overhead source. More importantly, cursors represent a category error in T-SQL thinking: the operation "emit one message per row" is expressible as a set operation (string aggregation), and the set-based form is always preferable when a correct equivalent exists. The procedural form should be reserved for cases where truly row-by-row state is required — which emitting debug messages is not.

Metadata

Metadata

Assignees

No one assigned

    Labels

    p4-deferredValid but deferred to future releasepersona-reviewIssue generated via SQL SME persona code reviewtriage-reviewedReviewed and triaged 2026-03-11

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions