phase_loss_cursor, join_cursor, and temporal_cursor use row-by-row iteration for operations expressible as set-based string aggregation — procedural anti-pattern

## Problem

At lines ~5517–5530, the procedure uses a cursor (`phase_loss_cursor`) to emit one `RAISERROR` message per statistic that lost its properties data between Phase 1 and Phase 2 of staged discovery. This is a classic case where a cursor is used to iterate over a result set and emit messages sequentially — a pattern that makes sense in a procedural language, but in T-SQL can be replaced by a set-based approach that eliminates the cursor entirely.

Additionally, at lines ~6962–7033, a `join_cursor` iterates over distinct database names to run a per-database Query Store join-pattern analysis. And at lines ~7101–7197, a `temporal_cursor` iterates over distinct databases with temporal tables. Both cursors perform a pattern that is structurally equivalent: "for each database in a set, execute a dynamic SQL block." This is the canonical case where `sys.sp_executesql` with a batched set-based approach, or a pre-built multi-database union query, would eliminate cursor overhead.

## Evidence

```sql
-- Lines 5517–5530: cursor for debug message emission
DECLARE phase_loss_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT N'  Phase 1→2 loss: ' + QUOTENAME(schema_name) + ... 
    FROM #stat_candidates WHERE modification_counter IS NULL
    ORDER BY schema_name, table_name, stat_name;
OPEN phase_loss_cursor;
FETCH NEXT FROM phase_loss_cursor INTO @phase_loss_msg;
WHILE @@FETCH_STATUS = 0
BEGIN
    RAISERROR(@phase_loss_msg, 10, 1) WITH NOWAIT;
    FETCH NEXT FROM phase_loss_cursor INTO @phase_loss_msg;
END;
```

The `phase_loss_cursor` is used solely to emit messages. A set-based alternative using `FOR XML` to concatenate and emit a single formatted message (or `STRING_AGG` on SQL 2017+) would produce equivalent debug output in one `RAISERROR` call with zero cursor overhead.

For `join_cursor` and `temporal_cursor`: these iterate over a set of database names to run per-database dynamic SQL. The set-based rewrite uses a `WHILE EXISTS` pattern (already used elsewhere in the procedure for the database iteration loop) or consolidates the dynamic SQL into a single parameterized batch that handles all databases in one `sp_executesql` call.

## Proposed Fix

**phase_loss_cursor replacement (SQL 2017+)**:
```sql
-- Replace cursor with STRING_AGG (SQL 2017+) or FOR XML (SQL 2016)
DECLARE @loss_summary nvarchar(max);
SELECT @loss_summary = STRING_AGG(
    N'  Phase 1→2 loss: ' + QUOTENAME(schema_name) + N'.' + QUOTENAME(table_name) + N'.' + QUOTENAME(stat_name),
    NCHAR(13) + NCHAR(10)
) FROM #stat_candidates WHERE modification_counter IS NULL;

IF @loss_summary IS NOT NULL
    RAISERROR(@loss_summary, 10, 1) WITH NOWAIT;
```

For `join_cursor` and `temporal_cursor`: the set-based equivalent is to accumulate the per-database dynamic SQL fragments into a single `nvarchar(max)` variable before executing — one `EXEC` call per database batch rather than one cursor iteration per database.

Note: `RAISERROR` with nvarchar strings has a 2,047 character limit per call. The `STRING_AGG` approach should chunk output at 2,047 characters if more than ~20 stats are in the loss set.

## Impact

Cursors inside dynamic SQL that is itself inside a maintenance loop multiply the cursor overhead with the number of databases and the number of qualifying statistics. On a server with 20 databases and 5,000 statistics, these cursors are a measurable overhead source. More importantly, cursors represent a category error in T-SQL thinking: the operation "emit one message per row" is expressible as a set operation (string aggregation), and the set-based form is always preferable when a correct equivalent exists. The procedural form should be reserved for cases where truly row-by-row state is required — which emitting debug messages is not.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

phase_loss_cursor, join_cursor, and temporal_cursor use row-by-row iteration for operations expressible as set-based string aggregation — procedural anti-pattern #317

Problem

Evidence

Proposed Fix

Impact

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

phase_loss_cursor, join_cursor, and temporal_cursor use row-by-row iteration for operations expressible as set-based string aggregation — procedural anti-pattern #317

Description

Problem

Evidence

Proposed Fix

Impact

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions