feat: add slow request tracking, request size histogram, and upstream health gauge by abtreece · Pull Request #7 · abtreece/lua-resty-ngxstats

abtreece · 2026-01-16T01:41:15Z

Summary

Add slow request counter (nginx_server_zone_slow_requests_total) - counts requests exceeding configurable threshold (default 1 second)
Add request size histogram (nginx_server_zone_request_length_bytes_bucket) - tracks request body size distribution with buckets from 100B to 100MB
Add upstream health gauge (nginx_upstream_health) - provides calculated health status (1=healthy, 0=unhealthy) based on failure rate threshold (default 10%)

Changes

lib/resty/ngxstats/common.lua: Added constants (REQUEST_SIZE_BUCKETS, SLOW_REQUEST_THRESHOLD, UPSTREAM_HEALTH_FAILURE_THRESHOLD) and calculate_upstream_health() function
lib/resty/ngxstats/log.lua: Refactored to extract record_ssl_metrics() and record_upstream_metrics() helper functions, added new metric collection
lib/resty/ngxstats/prometheus.lua: Added metric definitions and handlers for new metrics
spec/common_spec.lua: Added 11 new unit tests for constants and health calculation
spec/prometheus_spec.lua: Added 3 new unit tests for Prometheus format output
README.md: Updated metrics table and feature list
CHANGELOG.md: Documented new features

Test plan

All 76 unit tests pass
All 55 integration tests pass
Linting passes with 0 warnings/errors
CI pipeline passes

… health gauge - Add slow request counter (requests exceeding 1s threshold) - Add request size histogram with buckets from 100B to 100MB - Add upstream health gauge (1=healthy, 0=unhealthy based on failure rate) - Refactor log.lua to extract helper functions for SSL and upstream metrics - Add unit tests for new constants and functions - Update README with new metrics documentation - Update CHANGELOG with new features

Copilot

Pull request overview

This PR adds three new monitoring capabilities to the ngxstats library: slow request tracking, request size histograms, and upstream health status calculation.

Changes:

Introduces slow request counter to track requests exceeding 1 second threshold
Adds request size histogram with buckets ranging from 100 bytes to 100MB
Implements upstream health gauge based on failure rate (10% threshold)

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
lib/resty/ngxstats/common.lua	Added constants for new metrics and health calculation function
lib/resty/ngxstats/log.lua	Refactored metric recording into helper functions and integrated new metrics
lib/resty/ngxstats/prometheus.lua	Added Prometheus format definitions for new metrics
spec/common_spec.lua	Added test coverage for new constants and health calculation logic
spec/prometheus_spec.lua	Added tests verifying Prometheus output format for new metrics
README.md	Updated documentation with new metrics and feature comparison
CHANGELOG.md	Documented new features in unreleased section

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-16T01:42:01Z

+    if proxy_host ~= "" then
+        local health = common.calculate_upstream_health(stats, upstream_name)
+        common.update_num(stats, common.key({'upstreams', upstream_name, 'health'}), health)
    end


The health calculation should be scoped within the upstream_response_time conditional block (lines 152-161) to ensure it only executes when upstream metrics are being recorded. Currently, it relies on upstream_name which is only set when an upstream exists, but the logic would be clearer if this block was inside the upstream metrics section.

Copilot · 2026-01-16T01:42:02Z

+        it("should return 1 (healthy) when exactly at threshold", function()
+            -- 90 requests, 10 failures = 10% failure rate (at threshold, not above)
+            mock_stats:set("upstreams:edge:requests", 90)
+            mock_stats:set("upstreams:edge:failures", 10)
+
+            local health = common.calculate_upstream_health(mock_stats, "edge")
+            assert.equals(1, health)
+        end)


The test comment indicates the failure rate is 'at threshold, not above', but with 90 requests and 10 failures, the total is 100 and the failure rate is 10%, which equals the threshold (0.1). The implementation uses > on line 267 of common.lua, so this test correctly expects healthy (1). However, consider adding a test case with 91 requests and 10 failures (9.9% failure rate) to verify the boundary condition more explicitly.

Copilot · 2026-01-16T01:42:02Z

+    local requests = stats:get(requests_key) or 0
+    local failures = stats:get(failures_key) or 0
+
+    -- Need at least some requests to determine health


The comment says 'Need at least some requests to determine health' but the code also returns healthy (1) when there is no data. Consider clarifying the comment to explicitly state the assumption, such as: 'Assume healthy if no data is available yet'.

Suggested change

-- Need at least some requests to determine health

-- Need at least some requests to determine health; assume healthy if no data is available yet

Copilot AI review requested due to automatic review settings January 16, 2026 01:41

Copilot AI reviewed Jan 16, 2026

View reviewed changes

abtreece merged commit d1d770b into main Jan 16, 2026
8 checks passed

abtreece deleted the feature/enhanced-metrics branch January 16, 2026 01:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add slow request tracking, request size histogram, and upstream health gauge#7

feat: add slow request tracking, request size histogram, and upstream health gauge#7
abtreece merged 1 commit into
mainfrom
feature/enhanced-metrics

abtreece commented Jan 16, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Jan 16, 2026

Uh oh!

Copilot AI Jan 16, 2026

Uh oh!

Copilot AI Jan 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	-- Need at least some requests to determine health
	-- Need at least some requests to determine health; assume healthy if no data is available yet

Conversation

abtreece commented Jan 16, 2026

Summary

Changes

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants