Summary
We are seeing long-running HTTP MCP backend tool calls fail at around 120 seconds even when the gateway and caller are configured to allow much longer execution.
This appears to be caused by a hardcoded overall HTTP client timeout in the HTTP backend transport path, which is distinct from the documented and configurable gateway toolTimeout and per-server connect_timeout.
In practice, this means:
- `toolTimeout` can be set higher, but backend HTTP tool calls still fail around 120s.
- `connect_timeout` is not sufficient because it only covers transport setup / connect behavior.
- Callers see transport errors like `Client.Timeout exceeded while awaiting headers` even though the backend may still be processing the request.
Observed behavior
From a real workflow using an HTTP MCP backend:
- the backend tool call ran long enough to exceed the transport timeout
- the caller failed with an HTTP client timeout
- the backend then treated the request as disconnected and canceled the work
Representative error:

```
MCP error -32005: Post "http://host.docker.internal:8000/mcp": context deadline exceeded
(Client.Timeout exceeded while awaiting headers)
```
This was initially confusing because the surrounding system had larger time budgets configured, but the failure still occurred near the transport layer.
Why this looks like an mcpg issue
There are three different timeout concepts involved:
1. Gateway `toolTimeout`
   - This is documented and configurable.
   - Default appears to be 60s.
2. Per-server HTTP `connect_timeout`
   - This is documented and configurable.
   - It applies to HTTP transport establishment / fallback attempts.
3. Overall HTTP request timeout for HTTP backends
   - This appears to be hardcoded to 120s in the HTTP client used by `NewHTTPConnection`.
   - This is the timeout that seems to terminate long-running backend tool calls.

The key problem is that item 3 is currently the real ceiling for long-running HTTP tool calls, regardless of larger configured `toolTimeout` values.
Relevant implementation details
In `NewHTTPConnection`, the HTTP client is created with a fixed overall timeout:

```go
httpClient := &http.Client{
    Timeout: 120 * time.Second, // Overall request timeout
    Transport: &http.Transport{
        DialContext: (&net.Dialer{
            Timeout: connectTimeout,
        }).DialContext,
        // ...
        ResponseHeaderTimeout: connectTimeout,
    },
}
```
There is already explicit config support for:
- gateway `toolTimeout`
- server `connect_timeout`

And the code/comments already distinguish `connect_timeout` from the HTTP client's overall timeout.
There are also tests that explicitly acknowledge the 120s client timeout, for example an integration timeout test comment that says it is skipped because "the HTTP client has a 120s timeout".
That makes this look like current transport behavior rather than caller-side misconfiguration.
Why `connect_timeout` is not the fix
The docs and code indicate that `connect_timeout` is for transport setup and fallback attempts:
- streamable HTTP connect
- SSE connect
- plain JSON fallback connection behavior

It is not an end-to-end execution timeout for a long-running `tools/call` request after the transport is already established.
So increasing `connect_timeout` does not address the main issue for long-running HTTP tool execution.
Impact
This affects any HTTP MCP backend that can legitimately take more than about 120 seconds to return a result, including:
- large-repo semantic/query backends
- indexing/query services
- long-running analysis tools
- backends that stream late or do significant preprocessing before sending headers
The net effect is that:
- callers cannot rely on the configured `toolTimeout` for HTTP backends
- long-running tools fail with transport-level timeout errors
- backend services may continue work until they notice the disconnect
- diagnosing the problem is difficult because the failure looks like a backend/tool issue, but the hard limit is actually in the gateway transport
Example failure
Example workflow run:
This run is useful because it shows that:
- the repo-mind HTTP backend started successfully
- the MCP gateway connected successfully to the backend
- the later failure happened during the actual backend tool call, not during startup
Relevant evidence from the run:

```
✓ repo-mind: connected
⏱️ TIMING: Server check for repo-mind took 60ms
✗ search (MCP: repo-mind) · 401 unauthorized pull request public repository signed out unauthentica…
└ MCP server 'repo-mind': McpError: MCP error -32005: calling "tools/call": sending "tools/call":
  rejected by transport: Post "http://host.docker.internal:8000/mcp": context deadline exceeded
  (Client.Timeout exceeded while awaiting headers)
```
And from the MCP gateway summary for the same run:
- 🔍 rpc **repo-mind**→`tools/call` `search`
- 🔍 rpc **repo-mind**←`resp` ⚠️`calling "tools/call": sending "tools/call": rejected by transport: Post "http://host.docker.internal:8000/mcp": context deadline exceeded (Client.Timeout exceeded while awaiting headers)`
This is the important part: the backend was already connected, tools were already registered, and the failure still occurred during `tools/call`. That points at the execution/request timeout path, not the connection timeout path.
Expected behavior
One of the following should be true:

1. The HTTP backend request timeout should be derived from the gateway `toolTimeout`.
   - If `toolTimeout` is 600s, the underlying HTTP request should be allowed to run that long.
2. The HTTP backend request timeout should be separately configurable.
   - For example, a per-server `request_timeout` or a gateway-level `httpRequestTimeout`.
3. At minimum, the effective timeout behavior should be documented clearly.
   - If a hardcoded 120s limit is intended, that should be explicit in the docs, because it overrides the practical usefulness of larger `toolTimeout` values for HTTP backends.
Actual behavior
HTTP backends appear to be capped by a hardcoded 120s client timeout even when:
- the caller has a larger tool budget
- the gateway `toolTimeout` is larger
- the backend is still actively processing the request
Suggested fix
Preferred:
- Make the HTTP client request timeout configurable instead of hardcoded.

Reasonable options:
- Derive the HTTP client timeout from the gateway `toolTimeout`.
- Add a new explicit timeout field for HTTP backend request execution.
- If both exist, use a clear precedence rule.
- Consider relying on request context deadlines consistently instead of a fixed `http.Client.Timeout` when possible.

The cleanest model seems to be:
- `connect_timeout`: connection/setup/fallback
- `toolTimeout`: gateway/tool execution budget
- HTTP request timeout: either derived from `toolTimeout` or configurable explicitly, but not silently hardcoded to 120s
Potential acceptance criteria
- HTTP MCP backend tools can run longer than 120s when gateway/tool configuration allows it.
- A configured larger `toolTimeout` is actually honored for HTTP backends.
- There is test coverage showing a long-running HTTP backend request is not cut off at 120s when configuration permits it.
- Documentation clearly distinguishes:
  - startup timeout
  - connect timeout
  - tool execution timeout
  - HTTP backend request timeout
Minimal reproduction idea
1. Start `gh-aw-mcpg` with an HTTP backend server.
2. Expose a tool that intentionally sleeps for more than 120 seconds before returning.
3. Configure a gateway `toolTimeout` larger than 120 seconds.
4. Invoke the tool through the gateway.
5. Observe that the request still fails around 120 seconds with an HTTP client timeout.
Additional context
This came up while debugging a real GitHub Actions workflow that already had larger tool budgets configured. The workflow-side timeouts were not the limiting factor. The failure was due to the transport ceiling in the HTTP backend path.
If useful, I can also provide:
- the full caller and backend log excerpts from the workflow run
- a concrete repro server
- a proposed patch direction for threading the timeout through configuration and transport creation