Skip to content

BUG (P1): RESAMPLE_PERSIST path has no histogram quality floor — silently builds degraded histograms on low-sample stats #329

@nanoDBA

Description

@nanoDBA

Summary

The RESAMPLE_PERSIST path (lines 6133–6148) honors a persisted sample rate via RESAMPLE unconditionally — there is no minimum sample size floor check. If the persisted sample rate is very low (e.g., 0.01% on a 10B-row table), sp_StatUpdate will RESAMPLE using a rate that produces fewer than 200 usable histogram steps without any warning or override.

The values needed to compute the floor (@current_row_count and @current_persisted_sample_percent) are both in scope at the decision point, but the product row_count × (percent / 100.0) (absolute sampled rows) is never computed, and no floor parameter exists.


Impact

  • Silent histogram degradation. A RESAMPLE_PERSIST call with a low persisted sample rate on a large table can produce a histogram with fewer than 200 meaningful steps — CE range estimates can be off by 10× or more.
  • Invisible in query plans. When histogram steps are built from fewer than 50–100 sampled rows per step, CE range estimate errors reach 10×+, which directly corrupts row goal scale factors. The resulting plan failure (Nested Loops + Seek processing far more rows than expected) is invisible in EstimatedRowsWithoutRowGoal because the histogram estimate itself is wrong — the plan looks correct but uses a corrupted cardinality baseline.
  • RESAMPLE honors a rate that may have been calibrated at a different table scale. A persisted rate of 1% calibrated at 100M rows (1M sampled) becomes 1% at 10B rows (still 1M sampled) — but also 1% at 1M rows (10K sampled — well below any useful histogram quality floor).

Root Cause

-- CURRENT (v2.3) — lines 6133–6148
-- @current_row_count is populated: ✅
-- @current_persisted_sample_percent is populated: ✅
-- absolute_sampled_rows = row_count × (percent/100.0): ❌ NEVER COMPUTED
-- @PersistSampleMinRows parameter: ❌ DOES NOT EXIST — floor check never written

ELSE IF @effective_sample_percent IS NULL
AND     @current_persisted_sample_percent IS NOT NULL
AND     @is_long_running_stat = 0
BEGIN
    SELECT
        @with_clause  = N'RESAMPLE',
        @has_with_option = 1,
        @sample_source = N'RESAMPLE_PERSIST';
    -- No floor check. RESAMPLE fires unconditionally regardless of resulting sample size.
END

SQL Server's own auto-sample algorithm converges to approximately 1,000,000 absolute sampled rows for large tables — this is the closest available engineering proxy for "adequate for 200-step histogram construction." A 500K floor is the right order of magnitude; the v2.4 default should be 1,000,000 to align with SQL Server's own threshold.


Fix

1. Compute @absolute_sampled_rows before the floor check

Insert the computation before the IF @resample_persist = 1 block:

-- Compute absolute sampled rows for histogram quality floor check
DECLARE @absolute_sampled_rows BIGINT = NULL;

IF @current_row_count IS NOT NULL
   AND @current_persisted_sample_percent IS NOT NULL
   AND @current_persisted_sample_percent > 0
BEGIN
    SET @absolute_sampled_rows = CAST(
        @current_row_count * (@current_persisted_sample_percent / 100.0)
        AS BIGINT
    );
END

2. Use it in the existing floor comparison (no structural change needed)

The IF @resample_persist = 1 AND ... AND @absolute_sampled_rows >= @PersistSampleMinRows guard then works as intended.

3. Add @PersistSampleMinRows parameter (new)

@PersistSampleMinRows BIGINT = 1000000    -- minimum absolute sampled rows for RESAMPLE_PERSIST path

The default of 1,000,000 (1M rows) aligns with SQL Server's own auto-sample algorithm convergence threshold — the internal engineering reference for "enough rows to produce a reliable 200-step histogram." When @absolute_sampled_rows < @PersistSampleMinRows, fall back to an explicit sample rate rather than RESAMPLE.

4. Log to CommandLog ExtendedInfo

Add persisted_sampled_rows as an absolute figure to the CommandLog ExtendedInfo XML column. The current persisted_sample_percent alone is uninterpretable without the denominator — a DBA reviewing CommandLog cannot evaluate histogram quality without this.

<ExtendedInfo>
  ...
  <PersistedSamplePercent>0.10</PersistedSamplePercent>
  <PersistedSampledRows>10000000</PersistedSampledRows>  <!-- NEW -->
</ExtendedInfo>

Testing Requirements

  • Validate @absolute_sampled_rows is computed before the RESAMPLE_PERSIST floor check
  • Validate floor check fires correctly for tables below 1M sampled rows
  • Validate floor check passes for tables at or above 1M sampled rows
  • Validate NULL-safe handling (both inputs NULL → skip floor check, not error)
  • Test with @Execute = 'N' (dry-run) to verify variable is set even without execution
  • Multi-version matrix: no version gate required (pure T-SQL arithmetic)

References

  • Internal analysis: PERSIST_SAMPLE_PERCENT adequacy floor
  • Claim-194: Code gap — absolute_sampled_rows never computed (❌ Bug Fix)
  • Claim-193: SQL Server auto-sample convergence ~1M rows (engineering reference)
  • Claim-196: CE downstream impact — row goal backfire invisible in EstimatedRowsWithoutRowGoal
  • Related: issue-p2b-qs-forced-plan-always-on.md (plan stability interaction)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingp1-criticalBug or defect that causes incorrect behavior, data loss, or silent failure

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions