-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Problem
At lines ~5517–5530, the procedure uses a cursor (phase_loss_cursor) to emit one RAISERROR message per statistic that lost its properties data between Phase 1 and Phase 2 of staged discovery. This is a classic case where a cursor is used to iterate over a result set and emit messages sequentially — a pattern that makes sense in a procedural language, but in T-SQL can be replaced by a set-based approach that eliminates the cursor entirely.
Additionally, at lines ~6962–7033, a join_cursor iterates over distinct database names to run a per-database Query Store join-pattern analysis. And at lines ~7101–7197, a temporal_cursor iterates over distinct databases with temporal tables. Both cursors perform a pattern that is structurally equivalent: "for each database in a set, execute a dynamic SQL block." This is the canonical case where sys.sp_executesql with a batched set-based approach, or a pre-built multi-database union query, would eliminate cursor overhead.
Evidence
-- Lines 5517–5530: cursor for debug message emission
DECLARE phase_loss_cursor CURSOR LOCAL FAST_FORWARD FOR
SELECT N' Phase 1→2 loss: ' + QUOTENAME(schema_name) + ...
FROM #stat_candidates WHERE modification_counter IS NULL
ORDER BY schema_name, table_name, stat_name;
OPEN phase_loss_cursor;
FETCH NEXT FROM phase_loss_cursor INTO @phase_loss_msg;
WHILE @@FETCH_STATUS = 0
BEGIN
RAISERROR(@phase_loss_msg, 10, 1) WITH NOWAIT;
FETCH NEXT FROM phase_loss_cursor INTO @phase_loss_msg;
END;The phase_loss_cursor is used solely to emit messages. A set-based alternative using FOR XML to concatenate and emit a single formatted message (or STRING_AGG on SQL 2017+) would produce equivalent debug output in one RAISERROR call with zero cursor overhead.
For join_cursor and temporal_cursor: these iterate over a set of database names to run per-database dynamic SQL. The set-based rewrite uses a WHILE EXISTS pattern (already used elsewhere in the procedure for the database iteration loop) or consolidates the dynamic SQL into a single parameterized batch that handles all databases in one sp_executesql call.
Proposed Fix
phase_loss_cursor replacement (SQL 2017+):
-- Replace cursor with STRING_AGG (SQL 2017+) or FOR XML (SQL 2016)
DECLARE @loss_summary nvarchar(max);
SELECT @loss_summary = STRING_AGG(
N' Phase 1→2 loss: ' + QUOTENAME(schema_name) + N'.' + QUOTENAME(table_name) + N'.' + QUOTENAME(stat_name),
NCHAR(13) + NCHAR(10)
) FROM #stat_candidates WHERE modification_counter IS NULL;
IF @loss_summary IS NOT NULL
RAISERROR(@loss_summary, 10, 1) WITH NOWAIT;For join_cursor and temporal_cursor: the set-based equivalent is to accumulate the per-database dynamic SQL fragments into a single nvarchar(max) variable before executing — one EXEC call per database batch rather than one cursor iteration per database.
Note: RAISERROR with nvarchar strings has a 2,047 character limit per call. The STRING_AGG approach should chunk output at 2,047 characters if more than ~20 stats are in the loss set.
Impact
Cursors inside dynamic SQL that is itself inside a maintenance loop multiply the cursor overhead with the number of databases and the number of qualifying statistics. On a server with 20 databases and 5,000 statistics, these cursors are a measurable overhead source. More importantly, cursors represent a category error in T-SQL thinking: the operation "emit one message per row" is expressible as a set operation (string aggregation), and the set-based form is always preferable when a correct equivalent exists. The procedural form should be reserved for cases where truly row-by-row state is required — which emitting debug messages is not.