Merged
1 change: 1 addition & 0 deletions Cargo.toml
@@ -7,6 +7,7 @@ edition = "2021"

[dependencies]
reqwest = { version = "0.12", features = ["json", "rustls-tls-native-roots", "cookies"], default-features = false } # Using rustls-tls-native-roots with cookie support
hyper-util = "0.1" # For HttpInfo — accurate connection tracking (Issue #119)
tokio = { version = "1", features = ["full"] } # "full" includes everything you need for async main
prometheus = "0.13"
hyper = { version = "0.14", features = ["full"] } # For the HTTP server
60 changes: 23 additions & 37 deletions docs/CONNECTION_POOL.md
@@ -52,14 +52,12 @@ config:
pool:
maxIdlePerHost: 32
idleTimeoutSecs: 30
metricsReuseThresholdMs: 100
```

| Field | Default | Description |
|--------------------------|---------|--------------------------------------------------|
| `maxIdlePerHost` | `32` | Max idle connections per host. Set to `0` to disable pooling. |
| `idleTimeoutSecs` | `30` | Seconds before idle connections are closed. Set to `0` to close immediately. |
| `metricsReuseThresholdMs`| `100` | Latency threshold (ms) for the Prometheus metrics heuristic. Does **not** affect actual connection behavior — only how metrics classify requests as "new" vs "reused". |
| Field | Default | Description |
|------------------|---------|----------------------------------------------------------------------|
| `maxIdlePerHost` | `32` | Max idle connections per host. Set to `0` to disable pooling. |
| `idleTimeoutSecs`| `30` | Seconds before idle connections are closed. Set to `0` to close immediately. |

## Use Case: Force New Connection Per Request

@@ -180,41 +178,29 @@ will transparently open a new connection when this happens.

## Monitoring Connection Reuse

Prometheus metrics are available on port 9090:

| Metric | Type | Description |
|-----------------------------------------|------------|------------------------------------------|
| `connection_pool_likely_new_total` | Counter | Requests classified as new connections |
| `connection_pool_likely_reused_total` | Counter | Requests classified as reused connections|
| `connection_pool_reuse_rate_percent` | Gauge | Current reuse percentage |
| `connection_pool_requests_total` | Counter | Total requests tracked |
| `connection_pool_max_idle_per_host` | Gauge | Configured max idle setting |
| `connection_pool_idle_timeout_seconds` | Gauge | Configured idle timeout setting |

### Important: Metrics Are Heuristic-Based

The "new" vs "reused" classification uses a **latency heuristic**, not actual
connection state (reqwest does not expose this). Requests slower than
`metricsReuseThresholdMs` (default: 100ms) are classified as "likely new
connection" because a TLS handshake typically adds 50-150ms.

This means:

- Fast targets where TLS completes in <100ms will **undercount** new connections
- Slow targets where reused requests take >100ms will **overcount** new connections

Tune `metricsReuseThresholdMs` in the YAML to match your target's typical TLS
handshake time for more accurate classification. For definitive connection
tracking, check server-side access logs.
Prometheus metrics are available on port 9090. Connection tracking uses
**local TCP port comparison**: each response's local socket address is
inspected. A new local port means a new TCP connection was established;
the same local port means the connection was reused from the pool. This
is deterministic and accurate at any RPS.

| Metric                                 | Type    | Description                              |
|----------------------------------------|---------|------------------------------------------|
| `connection_pool_new_total`            | Counter | Requests that used a new TCP connection  |
| `connection_pool_reused_total`         | Counter | Requests that reused a pooled connection |
| `connection_pool_reuse_rate_percent`   | Gauge   | Current reuse percentage                 |
| `connection_pool_requests_total`       | Counter | Total requests tracked                   |
| `connection_pool_max_idle_per_host`    | Gauge   | Configured max idle setting              |
| `connection_pool_idle_timeout_seconds` | Gauge   | Configured idle timeout setting          |
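
The port comparison can be sketched as a small tracker. This is an illustrative sketch, not this crate's actual implementation: `ConnTracker` and its method names are hypothetical, and a plain `HashSet` of ports ignores ephemeral-port recycling, which a production tracker would need to account for.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Classifies each response's connection as new or reused by comparing
/// the local port of its socket against ports seen on earlier requests.
/// (Hypothetical sketch; names are not from this codebase.)
struct ConnTracker {
    seen_local_ports: HashSet<u16>,
}

impl ConnTracker {
    fn new() -> Self {
        Self { seen_local_ports: HashSet::new() }
    }

    /// Returns `true` when this local port has not been seen before,
    /// i.e. a new TCP connection was established for this request.
    /// (`HashSet::insert` returns `true` exactly when the value is new.)
    fn is_new_connection(&mut self, local: SocketAddr) -> bool {
        self.seen_local_ports.insert(local.port())
    }
}
```

In practice, with reqwest 0.12 the local address can be read from the response extensions via hyper-util's `HttpInfo` (`resp.extensions().get::<HttpInfo>()` exposes `local_addr()`), which appears to be what the `hyper-util` dependency added in this PR is for.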

### Grafana Queries

**New vs reused connections over time (time series panel):**

| Query | Legend |
|-------------------------------------------------|----------|
| `rate(connection_pool_likely_reused_total[1m])` | Reused |
| `rate(connection_pool_likely_new_total[1m])` | New |
| Query | Legend | Color |
|--------------------------------------------|--------|-------|
| `rate(connection_pool_reused_total[1m])` | Reused | Green |
| `rate(connection_pool_new_total[1m])` | New | Red |

**Reuse rate (single stat panel):**

@@ -225,5 +211,5 @@ connection_pool_reuse_rate_percent
**Percentage of new connections (single stat panel):**

```promql
connection_pool_likely_new_total / connection_pool_requests_total * 100
connection_pool_new_total / connection_pool_requests_total * 100
```
31 changes: 8 additions & 23 deletions src/config.rs
@@ -101,7 +101,6 @@ pub struct Config {
// When Some, these override env-var defaults when building the HTTP client.
pub pool_max_idle_per_host: Option<usize>,
pub pool_idle_timeout_secs: Option<u64>,
pub pool_metrics_reuse_threshold_ms: Option<u64>,
}

/// Helper to get a required environment variable.
@@ -236,15 +235,10 @@ impl Config {
let auto_disable_percentiles_on_warning =
env_bool("AUTO_DISABLE_PERCENTILES_ON_WARNING", true);

let (pool_max_idle_per_host, pool_idle_timeout_secs, pool_metrics_reuse_threshold_ms) =
match &yaml_config.config.pool {
Some(p) => (
p.max_idle_per_host,
p.idle_timeout_secs,
p.metrics_reuse_threshold_ms,
),
None => (None, None, None),
};
let (pool_max_idle_per_host, pool_idle_timeout_secs) = match &yaml_config.config.pool {
Some(p) => (p.max_idle_per_host, p.idle_timeout_secs),
None => (None, None),
};

let config = Config {
target_url,
@@ -269,7 +263,6 @@
cluster: ClusterConfig::from_env(),
pool_max_idle_per_host,
pool_idle_timeout_secs,
pool_metrics_reuse_threshold_ms,
};

config.validate()?;
@@ -337,15 +330,10 @@ impl Config {
let auto_disable_percentiles_on_warning =
env_bool("AUTO_DISABLE_PERCENTILES_ON_WARNING", true);

let (pool_max_idle_per_host, pool_idle_timeout_secs, pool_metrics_reuse_threshold_ms) =
match &yaml_config.config.pool {
Some(p) => (
p.max_idle_per_host,
p.idle_timeout_secs,
p.metrics_reuse_threshold_ms,
),
None => (None, None, None),
};
let (pool_max_idle_per_host, pool_idle_timeout_secs) = match &yaml_config.config.pool {
Some(p) => (p.max_idle_per_host, p.idle_timeout_secs),
None => (None, None),
};

let config = Config {
target_url,
@@ -370,7 +358,6 @@
cluster: ClusterConfig::from_env(),
pool_max_idle_per_host,
pool_idle_timeout_secs,
pool_metrics_reuse_threshold_ms,
};

config.validate()?;
@@ -538,7 +525,6 @@ impl Config {
cluster: ClusterConfig::from_env(),
pool_max_idle_per_host: None,
pool_idle_timeout_secs: None,
pool_metrics_reuse_threshold_ms: None,
};

config.validate()?;
@@ -744,7 +730,6 @@ impl Config {
cluster: ClusterConfig::for_testing(),
pool_max_idle_per_host: None,
pool_idle_timeout_secs: None,
pool_metrics_reuse_threshold_ms: None,
}
}
