feat(health): dependency latency budget healthcheck for RPC and Horizon (#848) #964

Merged: Cedarich merged 6 commits into Pulsefy:main from amankoli09:feat/848-dependency-latency-budget-healthcheck on Jun 29, 2026
@amankoli09

Contributor

Summary

Closes #848

Adds health signals that fail when testnet dependency latency exceeds
acceptable thresholds, satisfying all four acceptance criteria:

| Criterion | How it's met |
| --- | --- |
| Health endpoint includes latency budget checks | GET /health now includes a latencyBudget field; new GET /health/latency endpoint exposes just the latency report |
| Separates hard-down from degraded states | Three-state model: ok / degraded / hard_down; only hard_down triggers HTTP 503 |
| Thresholds are configurable | Four env vars (HEALTH_HORIZON_LATENCY_DEGRADED_MS, HEALTH_HORIZON_LATENCY_HARD_DOWN_MS, HEALTH_SOROBAN_RPC_LATENCY_DEGRADED_MS, HEALTH_SOROBAN_RPC_LATENCY_HARD_DOWN_MS) with safe defaults |
| Useful for Vercel/preview smoke checks | GET /health/latency returns 200 on ok/degraded and 503 on hard_down, ideal for simple smoke check scripts |
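
As an illustration of the smoke-check use case in the last row, here is a minimal sketch of a script helper that treats HTTP 200 as a pass and anything else as a failure. The names `latencySmokeCheck` and `FetchLike` are hypothetical and not part of this PR; only the `/health/latency` path and the 200/503 semantics come from the description above.

```typescript
// Minimal shape of the fetch function used here, so a fake can be injected in tests.
type FetchLike = (url: string) => Promise<{ status: number }>;

// Returns true when GET /health/latency answers 200 (ok or degraded).
// Returns false on any other status (including 503 for hard_down) or on a network error.
async function latencySmokeCheck(baseUrl: string, fetchImpl: FetchLike): Promise<boolean> {
  // Strip trailing slashes so "https://host/" and "https://host" behave the same.
  const url = `${baseUrl.replace(/\/+$/, "")}/health/latency`;
  try {
    const res = await fetchImpl(url);
    return res.status === 200;
  } catch {
    return false; // an unreachable deployment counts as a failed check
  }
}
```

In a Node 18+ script the global `fetch` can be passed as `fetchImpl`, and the process can exit non-zero when the function resolves to `false`.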

What changed

New files

  • latency-budget.config.ts — reads threshold env vars, exports a typed
    LatencyBudgetConfig object.
  • latency-budget.health.service.ts — probes Horizon (HTTP GET) and
    Soroban RPC (getHealth JSON-RPC) concurrently, measures round-trip
    latency, classifies each as ok / degraded / hard_down.
  • latency-budget.health.service.spec.ts — unit tests: ok path, connection
    failures, error capture, response shape, state rollup.
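
To make the config file's role concrete, here is a rough sketch of how the four threshold env vars could be read into a typed object. The env var names and defaults are taken from this PR; the object shape (`horizon`, `sorobanRpc`), the helper `readMs`, and the fallback-on-invalid behavior are assumptions, and the actual latency-budget.config.ts may differ.

```typescript
interface DependencyLatencyBudget {
  degradedMs: number;
  hardDownMs: number;
}

// Assumed shape; the real LatencyBudgetConfig may use different field names.
interface LatencyBudgetConfig {
  horizon: DependencyLatencyBudget;
  sorobanRpc: DependencyLatencyBudget;
}

// Parse a positive integer of milliseconds, falling back when unset or invalid.
function readMs(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function loadLatencyBudgetConfig(env: Record<string, string | undefined>): LatencyBudgetConfig {
  return {
    horizon: {
      degradedMs: readMs(env.HEALTH_HORIZON_LATENCY_DEGRADED_MS, 1000),
      hardDownMs: readMs(env.HEALTH_HORIZON_LATENCY_HARD_DOWN_MS, 4000),
    },
    sorobanRpc: {
      degradedMs: readMs(env.HEALTH_SOROBAN_RPC_LATENCY_DEGRADED_MS, 1500),
      hardDownMs: readMs(env.HEALTH_SOROBAN_RPC_LATENCY_HARD_DOWN_MS, 5000),
    },
  };
}
```

Passing `process.env` as `env` would pick up real overrides; taking the map as a parameter keeps the function easy to unit test.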

Modified files

  • health.service.ts — injects LatencyBudgetHealthService; runs probe
    in parallel with existing checks; hard_down elevates status to error
    (HTTP 503); degraded keeps HTTP 200 but sets summary to degraded;
    latencyBudget object included in every health report.
  • health.controller.ts — new GET /health/latency endpoint.
  • health.module.ts — registers LatencyBudgetHealthService.
  • .env.example — documents all four threshold env vars with defaults.
  • health.service.spec.ts — adds LatencyBudgetHealthService mock;
    two new integration test cases (hard_down → 503, degraded → 200).

HTTP status semantics

| overallState | GET /health status | GET /health/latency status |
| --- | --- | --- |
| ok | 200 | 200 |
| degraded | 200 | 200 |
| hard_down | 503 | 503 |
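
The mapping in this table can be sketched as a small pure function. The status and summary values for hard_down and degraded follow this PR's commit message; the `HealthOutcome` type, the function name, and the `summary: "ok"` value for the ok state are assumptions for illustration only.

```typescript
type LatencyState = "ok" | "degraded" | "hard_down";

interface HealthOutcome {
  status: "ok" | "error";
  summary: "ok" | "degraded" | "down";
  httpStatus: 200 | 503;
}

// Translate the rolled-up latency state into the status fields and HTTP code.
function outcomeForLatencyState(state: LatencyState): HealthOutcome {
  if (state === "hard_down") {
    return { status: "error", summary: "down", httpStatus: 503 };
  }
  if (state === "degraded") {
    // Degraded stays HTTP 200 so smoke checks keep passing, but is still visible.
    return { status: "ok", summary: "degraded", httpStatus: 200 };
  }
  return { status: "ok", summary: "ok", httpStatus: 200 }; // summary value assumed
}
```

This sketch only covers the latency contribution; how it combines with failures from the other existing health checks is not shown here.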

Default thresholds

| Dependency | degradedMs | hardDownMs |
| --- | --- | --- |
| Horizon | 1000 ms | 4000 ms |
| Soroban RPC | 1500 ms | 5000 ms |
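
As a worked example of these defaults, the following sketch classifies a measured latency against a pair of thresholds. The hard-down rule (latency >= threshold, and connection failures always hard_down) comes from this PR's commit message; treating the degraded boundary as inclusive and modelling a failed connection as `null` are assumptions.

```typescript
type LatencyState = "ok" | "degraded" | "hard_down";

interface Thresholds {
  degradedMs: number;
  hardDownMs: number;
}

// latencyMs is null when the probe could not connect at all (assumed representation).
function classifyLatency(latencyMs: number | null, t: Thresholds): LatencyState {
  if (latencyMs === null) return "hard_down";
  if (latencyMs >= t.hardDownMs) return "hard_down";
  if (latencyMs >= t.degradedMs) return "degraded"; // inclusive boundary is an assumption
  return "ok";
}

const horizonDefaults: Thresholds = { degradedMs: 1000, hardDownMs: 4000 };
const sorobanRpcDefaults: Thresholds = { degradedMs: 1500, hardDownMs: 5000 };
```

With the defaults above, a 1200 ms response would classify as degraded for Horizon but ok for Soroban RPC, since the RPC budget is looser.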

…RPC (Pulsefy#848)

Add dependency latency budget signals that classify testnet RPC and
Horizon response times into ok / degraded / hard_down states.

Changes
-------
* latency-budget.config.ts
  - Reads per-dependency thresholds from HEALTH_HORIZON_LATENCY_DEGRADED_MS,
    HEALTH_HORIZON_LATENCY_HARD_DOWN_MS, HEALTH_SOROBAN_RPC_LATENCY_DEGRADED_MS,
    HEALTH_SOROBAN_RPC_LATENCY_HARD_DOWN_MS env vars (configurable).
  - Sensible defaults: Horizon degraded=1000ms / hard-down=4000ms,
    RPC degraded=1500ms / hard-down=5000ms.

* latency-budget.health.service.ts
  - Probes Horizon root via HTTP GET and Soroban RPC via JSON-RPC getHealth
    concurrently.
  - Measures round-trip latency and classifies each result as ok, degraded,
    or hard_down using the configured thresholds.
  - Connection failures are always hard_down; latency >= hard-down threshold
    is also hard_down; latency between thresholds is degraded.
  - Overall state is the worst state across all dependencies.

* health.service.ts
  - Injects LatencyBudgetHealthService; runs latency probe in parallel with
    existing checks.
  - hard_down latency → overall status=error (HTTP 503, summary=down).
  - degraded latency  → overall status=ok  (HTTP 200, summary=degraded).
  - latencyBudget object included in every health report response.

* health.controller.ts
  - New GET /health/latency endpoint returning just the latency budget report.
  - Returns HTTP 503 on hard_down, HTTP 200 otherwise — suitable for
    Vercel/preview smoke checks.

* health.module.ts
  - Registers LatencyBudgetHealthService.

* .env.example
  - Documents all four HEALTH_* threshold env vars with their defaults.
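
The concurrent probing and worst-state rollup described for latency-budget.health.service.ts above might look roughly like the sketch below. All names here (`timeProbe`, `worstState`, `probeAll`, the `dependencies` field) are hypothetical; the probe itself is injected as a function, so no real Horizon or Soroban RPC client calls are assumed.

```typescript
type LatencyState = "ok" | "degraded" | "hard_down";

interface ProbeResult {
  name: string;
  state: LatencyState;
  latencyMs: number | null; // null when the probe failed to connect
  error?: string;
}

const SEVERITY: Record<LatencyState, number> = { ok: 0, degraded: 1, hard_down: 2 };

// Time one probe; any thrown error is captured and reported as hard_down.
async function timeProbe(
  name: string,
  probe: () => Promise<unknown>,
  classify: (latencyMs: number) => LatencyState,
): Promise<ProbeResult> {
  const start = Date.now();
  try {
    await probe();
    const latencyMs = Date.now() - start;
    return { name, state: classify(latencyMs), latencyMs };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { name, state: "hard_down", latencyMs: null, error };
  }
}

// Overall state is the worst state across all dependencies.
function worstState(results: ProbeResult[]): LatencyState {
  return results.reduce<LatencyState>(
    (worst, r) => (SEVERITY[r.state] > SEVERITY[worst] ? r.state : worst),
    "ok",
  );
}

// Run all probes concurrently; timeProbe never rejects, so Promise.all is safe here.
async function probeAll(probes: Array<() => Promise<ProbeResult>>) {
  const dependencies = await Promise.all(probes.map((p) => p()));
  return { overallState: worstState(dependencies), dependencies };
}
```

Because each probe converts its own failure into a result, one dependency being down does not prevent the other's latency from being reported.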

Tests
-----
* latency-budget.health.service.spec.ts — full unit suite covering ok path,
  connection failures, error message capture, response shape.
* health.service.spec.ts — extended with LatencyBudgetHealthService mock;
  new cases for hard_down and degraded latency affecting overall status.

Closes Pulsefy#848
@drips-wave

drips-wave Bot commented Jun 27, 2026


@amankoli09 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

@Cedarich

Contributor

@amankoli09 fix workflow

@amankoli09

Contributor Author

@Cedarich Please approve the workflow

@Cedarich

Contributor

@amankoli09 please fix workflow

@Cedarich

Contributor

@amankoli09

@amankoli09

Contributor Author

@Cedarich I have fixed it, please check.

Cedarich merged commit 54aa604 into Pulsefy:main on Jun 29, 2026
1 check passed
Development

Successfully merging this pull request may close these issues.

Backend: Dependency latency budget healthcheck for RPC and Horizon