-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Summary
When the LastStatCompletedAt heartbeat write to CommandLog fails (e.g., table locked, permissions issue, transient deadlock), the exception propagates into the outer TRY/CATCH block, which marks the stat update as FAILED and increments @consecutive_failures. This happens even when the UPDATE STATISTICS itself completed successfully.
The result: a healthy run on a busy system can falsely trip @MaxConsecutiveFailures and abort, due entirely to heartbeat write failures — not stat update failures.
Example Failure Scenario
UPDATE STATISTICS [dbo].[OrderLines] ([IX_OrderDate]) → SUCCESS (5.2s)
CommandLog heartbeat write → DEADLOCK (victim)
Outer TRY/CATCH catches the deadlock
Marks [IX_OrderDate] as FAILED in accounting
@consecutive_failures increments to 3
@MaxConsecutiveFailures = 3 → procedure aborts
All statistics were successfully updated. The abort is a false positive.
Root Cause
The heartbeat write is inside the same TRY block as the stat update, without its own exception handler. Any failure in the heartbeat write causes the entire stat's TRY block to jump to the CATCH, which cannot distinguish "stat failed" from "stat succeeded, heartbeat failed."
-- CURRENT (simplified)
BEGIN TRY
EXECUTE sp_executesql @current_command; -- UPDATE STATISTICS
-- heartbeat write
UPDATE dbo.CommandLog
SET LastStatCompletedAt = GETDATE()
WHERE ...; -- if this throws, falls to CATCH
END TRY
BEGIN CATCH
SET @stat_succeeded = 0; -- incorrectly marks as failed
SET @consecutive_failures += 1;
END CATCHFix
Wrap the heartbeat write in a nested TRY/CATCH. Heartbeat failure is logged as a warning but does not affect the stat's success/failure accounting.
-- PROPOSED FIX
BEGIN TRY
EXECUTE sp_executesql @current_command; -- UPDATE STATISTICS
-- Heartbeat write in its own scope
BEGIN TRY
UPDATE dbo.CommandLog
SET LastStatCompletedAt = GETDATE()
WHERE ...;
END TRY
BEGIN CATCH
-- Log the heartbeat failure but do NOT rethrow
RAISERROR(
'[sp_StatUpdate] WARNING: Heartbeat write failed for %s.%s — stat update succeeded. Error: %s',
0, 1, @table_name, @stat_name, ERROR_MESSAGE()
) WITH NOWAIT;
-- @stat_succeeded remains 1; @consecutive_failures is NOT incremented
END CATCH
SET @stat_succeeded = 1;
END TRY
BEGIN CATCH
SET @stat_succeeded = 0; -- only genuine stat failures reach here
SET @consecutive_failures += 1;
END CATCHEdge Cases
- If the heartbeat write failure is due to a permissions problem, the warning will repeat for every stat. This is intentional — it makes the misconfiguration visible without aborting the run.
- If the CommandLog table is dropped mid-run, this will emit repeated warnings. Acceptable behavior.
- The nested TRY/CATCH pattern is supported on SQL Server 2005+ and does not require a version gate.
Testing Requirements
- Simulate heartbeat write failure (revoke INSERT on CommandLog mid-run) — confirm stat is marked SUCCESS, consecutive_failures not incremented, warning emitted
- Simulate genuine stat failure (UPDATE STATISTICS on inaccessible table) — confirm stat is marked FAILED, consecutive_failures incremented
- Confirm
@MaxConsecutiveFailuresabort does NOT trigger on heartbeat-only failures