## Overview
The FinOps tab already collects solid foundational data — utilization scoring, idle databases, index cleanup, storage analysis, server properties, and resource consumption. The next phase is turning that raw data into actionable cost-saving recommendations that help DBAs and IT managers justify budget decisions.
This issue tracks the full roadmap from per-server optimizations through estate-wide consolidation analysis.
## Phase 1: Per-Server Optimizations (SHIPPED)
- Enterprise feature usage audit — query `sys.dm_db_persisted_sku_features` per database to identify servers paying for Enterprise where no database uses Enterprise-only features
- CPU right-sizing score — compare licensed core count against sustained CPU utilization trends to flag servers where cores are significantly over-provisioned
- Memory right-sizing score — compare allocated max server memory against actual buffer pool + stolen memory usage to flag reclaimable RAM
- Unused index cost quantification — extend existing index cleanup to show wasted storage (GB), write I/O overhead, and estimated backup time impact for unused/duplicate indexes
- Compression savings estimator — flag large uncompressed heaps/indexes with estimated space savings (page vs. row compression candidates)
- Dormant database detection (enhanced) — extend existing idle database detection with estimated cost impact (backup storage, AG traffic, DBCC overhead, license implications)
- Dev/test workload detection — flag servers with production-edition licenses but development usage patterns (low connection counts, sporadic activity, database naming conventions)
## Phase 2: Edition & Licensing (SHIPPED)
- Enterprise → Standard downgrade report — per-server report showing which Enterprise features are actually in use, which databases use them, and what would need to change to move to Standard
- License cost impact estimates — list pricing differential so recommendations show estimated dollar savings
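The pricing differential is a pure function of core count and user-supplied list prices. A hedged sketch; the function name is hypothetical, and the rounding reflects standard SQL Server per-core licensing (2-core packs, 4-core minimum per server):

```python
def edition_downgrade_savings(cores: int,
                              ent_per_core_monthly: float,
                              std_per_core_monthly: float) -> float:
    """Estimated monthly savings from an Enterprise -> Standard downgrade.

    Per-core licenses sell in 2-core packs with a 4-core minimum per server,
    so round the billable count accordingly. Prices come from the user's
    own agreement, not hard-coded values.
    """
    billable = max(4, cores + (cores % 2))  # round up to an even pack count
    return billable * (ent_per_core_monthly - std_per_core_monthly)
```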
## Phase 3: Hardware & Maintenance Optimization (SHIPPED)
- Maintenance window efficiency — flag maintenance jobs with excessive duration relative to their historical average
- VM right-sizing recommendations — based on sustained CPU/memory trends, prescribe specific target reductions (e.g., "reduce from 32 to 16 cores based on P95 CPU of 22%")
- Storage tier optimization — flag databases with consistently low I/O latency (<5 ms reads) that don't need premium storage
- Reserved capacity candidates — servers with consistent, predictable utilization patterns (low coefficient of variation) that would benefit from reserved pricing over pay-as-you-go
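The reserved-capacity screen reduces to a coefficient-of-variation check over utilization samples. A sketch; the 0.25 threshold is an assumption for illustration:

```python
from statistics import mean, pstdev

def reserved_capacity_candidate(util_samples: list[float],
                                cv_threshold: float = 0.25) -> bool:
    """Flag servers whose load is steady enough to justify reserved pricing.

    Coefficient of variation = stdev / mean; low values mean predictable
    utilization. The 0.25 cutoff is illustrative.
    """
    m = mean(util_samples)
    if m == 0:
        return False  # idle server: handled by dormant detection instead
    return pstdev(util_samples) / m <= cv_threshold
```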
## Phase 4: High Impact Analysis (SHIPPED)
- High Impact Queries tab — 80/20 analysis across CPU, duration, reads, writes, memory, and executions. PERCENT_RANK scoring with composite impact score.
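The scoring core is small: SQL-style PERCENT_RANK is (rank - 1) / (n - 1), and the composite score averages a query's rank across metric dimensions. A sketch, assuming columnar metric inputs (names are hypothetical); the clamp mirrors the >1.0 guard described under Phase 5:

```python
from statistics import fmean

def percent_rank(values: list[float], x: float) -> float:
    """SQL-style PERCENT_RANK(): (rank - 1) / (n - 1); 0.0 for a single row."""
    n = len(values)
    if n <= 1:
        return 0.0
    rank = 1 + sum(1 for v in values if v < x)  # ties share the lowest rank
    return min((rank - 1) / (n - 1), 1.0)       # clamp: score never exceeds 1.0

def composite_impact(metric_columns: dict[str, list[float]], row: int) -> float:
    """Average one query's percentile rank across all metric dimensions."""
    return fmean(percent_rank(col, col[row]) for col in metric_columns.values())
```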
## Phase 5: Test Infrastructure (SHIPPED)
- HighImpactScorer extraction — scoring logic in testable static methods
- FinOps test scenarios — 6 DuckDB-seeded scenarios for deterministic validation of recommendation engine output
- FinOps xunit tests — 16 scenario-based tests validating recommendations and scoring
- HealthCalculator + PercentRank tests — 57 unit tests with adversarial inputs, found and fixed PercentRank >1.0 bug
## Phase 6: UI Polish (SHIPPED)
- Column filters on all FinOps grids — Dashboard (10 grids) and Lite Optimization (5 grids)
- Query preview with full-text tooltips — truncated in row, full text on hover
- Per-server monthly cost — budget field on server config, proportional cost attribution
- Health scores and license warnings — composite 0-100 score, Standard Edition boundary detection
- Storage Growth query fix — fallback to oldest snapshot when <7d history
- Empty grid scrollbar fix — disabled horizontal scroll on all FinOps DataGrids
## Future (post-2.3)
These items need more design work, new collectors, or cross-server architecture changes.
- Consolidation candidate scoring — identify groups of underutilized servers whose combined workloads could fit on fewer instances. Must account for version compatibility, edition requirements, and platform (cloud vs on-prem). Cannot consolidate across major version downgrades or edition downgrades where features are in use.
- Workload complementarity analysis — find servers with complementary peak usage windows via hourly CPU time-series correlation. Natural pairs for consolidation. Same version/edition/platform constraints apply.
- Postgres migration compatibility score — feature dependency scan, T-SQL compatibility analysis. Requires: stored procedure/function text scanning.
## Design Notes
- All recommendations are data-driven from DMVs and system catalog views we already collect
- Recommendations include confidence levels (high/medium/low) based on observation window length
- Where possible, savings are quantified as proportional share of the per-server monthly budget set by the user
- Both Lite (DuckDB) and Dashboard (SQL Server) surface recommendations
- Dollar estimates use the per-server monthly cost model — user tags each server with its budget, we show proportional attribution
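The proportional attribution in the notes above might look like the following sketch. Using per-database size as the usage driver is an assumption; any collected per-database metric could serve:

```python
def attribute_monthly_cost(db_usage: dict[str, float],
                           server_budget: float) -> dict[str, float]:
    """Split the user-supplied per-server monthly budget across databases
    in proportion to each database's share of a usage driver (e.g. GB).
    """
    total = sum(db_usage.values())
    if total == 0:
        return {name: 0.0 for name in db_usage}
    return {name: round(server_budget * v / total, 2)
            for name, v in db_usage.items()}
```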
## Related
- Existing FinOps tab infrastructure (utilization scoring, idle DBs, index cleanup, storage analysis)
- Server properties collector (daily, 365-day retention)
- Database size stats collector (hourly, 90-day retention)
- Running jobs collector (per schedule, with avg/p95 duration tracking)
- High Impact Queries tab (query_stats-based 80/20 analysis)
- FinOps test scenarios in TestDataSeeder
- 73 xunit tests (FinOpsTests + HealthCalculatorTests)