Description
When using spring-ai-starter-mcp-server-webmvc with Streamable HTTP transport, the server does not properly release sockets and Tomcat threads after an MCP client disconnects. Connections accumulate in TCP CLOSE-WAIT state indefinitely, eventually exhausting the Tomcat thread pool and making the server completely unresponsive — including to health checks.
Environment
- Spring AI: 1.1.4
- Spring Boot: 3.4.1
- Java: JDK 25
- Server: Tomcat 10.1.34 (WAR deployment)
- OS: Linux (Kubernetes, 2 CPU / 4 GB)
Configuration
spring:
  ai:
    mcp:
      server:
        type: SYNC
        protocol: STREAMABLE
        streamable-http:
          mcp-endpoint: /mcp
          keep-alive-interval: 0s
Steps to Reproduce
- Deploy an MCP Server using spring-ai-starter-mcp-server-webmvc with Streamable HTTP transport
- Have an external MCP client send POST /mcp requests (initialize → tools/call)
- Client receives the response and closes the TCP connection (sends FIN)
- Repeat at moderate frequency (e.g., 2-3 requests/second from a load balancer)
- After ~60 seconds, observe socket states on the server
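The steps above could be driven by a small raw-socket client along the following lines. This is only a sketch under assumptions: the host, port, client name, and protocolVersion value are placeholders, and it sends only the initialize request rather than the full initialize → tools/call sequence. A raw Socket is used instead of a pooled HTTP client so that the connection is closed (FIN sent) right after the first bytes of the response arrive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class McpCloseWaitRepro {

    // Placeholder initialize payload; protocol version and client info are assumptions.
    static final String INITIALIZE_BODY =
        "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"initialize\",\"params\":{"
        + "\"protocolVersion\":\"2025-03-26\",\"capabilities\":{},"
        + "\"clientInfo\":{\"name\":\"repro-client\",\"version\":\"0.0.1\"}}}";

    /** Builds a raw HTTP/1.1 POST so the caller controls exactly when the socket closes. */
    static String buildPost(String host, int port, String path, String body) {
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
            + "Host: " + host + ":" + port + "\r\n"
            + "Content-Type: application/json\r\n"
            + "Accept: application/json, text/event-stream\r\n"
            + "Content-Length: " + length + "\r\n"
            + "\r\n"
            + body;
    }

    /** Sends one request, reads the first chunk of the response, then closes the socket. */
    static int sendAndClose(String host, int port, String path, String body) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(5_000); // do not hang forever if the server is already stuck
            OutputStream out = socket.getOutputStream();
            out.write(buildPost(host, port, path, body).getBytes(StandardCharsets.UTF_8));
            out.flush();
            InputStream in = socket.getInputStream();
            return in.read(new byte[8192]); // bytes read, or -1 if the server closed first
        } // try-with-resources closes the socket here, which sends FIN to the server
    }

    public static void main(String[] args) throws Exception {
        // Roughly 2-3 requests/second for about a minute, matching the steps above.
        for (int i = 0; i < 150; i++) {
            sendAndClose("localhost", 8080, "/mcp", INITIALIZE_BODY);
            Thread.sleep(400);
        }
    }
}
```

While it runs, `ss -tnp | grep 8080` on the server side should show whether CLOSE-WAIT entries accumulate.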
Observed Behavior
$ ss -tlnp | grep 8080
LISTEN 151 150 *:8080 *:* # backlog full
$ ss -tnp | grep 8080 | head -5
CLOSE-WAIT 115 0 [::ffff:10.125.87.86]:8080 [::ffff:10.125.87.4]:47140
CLOSE-WAIT 115 0 [::ffff:10.125.87.86]:8080 [::ffff:10.125.87.4]:42756
CLOSE-WAIT 115 0 [::ffff:10.125.87.86]:8080 [::ffff:10.125.87.4]:47446
CLOSE-WAIT 115 0 [::ffff:10.125.87.86]:8080 [::ffff:10.125.87.4]:50138
CLOSE-WAIT 115 0 [::ffff:10.125.87.86]:8080 [::ffff:10.125.87.4]:43160
$ ss -tnp | grep 8080 | wc -l
150 # all threads consumed
$ curl --max-time 5 http://localhost:8080/health
curl: (28) Failed to connect to localhost port 8080: Connection timed out
- All 150 connections are in CLOSE-WAIT (client sent FIN, server never called close)
- All connections come from the same upstream load balancer IP
- Tomcat thread pool is fully exhausted — no new requests can be processed
- The application itself started successfully (Started in 12.6 seconds)
Expected Behavior
When a client closes the TCP connection, the MCP server transport should:
- Detect the peer shutdown (IOException on write or read returning -1)
- Complete or cancel the Servlet async context
- Close the server-side socket
- Release the Tomcat thread back to the pool
- Remove the session from the internal session map
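The detect-then-close part of this list can be illustrated at the plain socket level. The sketch below uses only java.net and is not the Tomcat or MCP transport code. The class and method names are hypothetical. It shows that a peer's FIN surfaces as read() returning -1, and that closing the server-side socket at that point is what moves it out of CLOSE-WAIT.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class PeerShutdownDemo {

    /**
     * Reads until the peer half-closes (read returns -1) or the read fails,
     * then closes our side so the socket does not linger in CLOSE-WAIT.
     * Returns true if an orderly peer shutdown (EOF) was observed.
     */
    static boolean drainUntilPeerCloses(Socket socket) {
        try (socket) { // our side is closed on every exit path
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[1024];
            while (true) {
                if (in.read(buf) == -1) {
                    return true; // peer sent FIN
                }
            }
        } catch (IOException e) {
            return false; // connection reset or read timeout
        }
    }

    /** Loopback demo: the client connects and closes; the server side detects it and closes too. */
    static boolean demo() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            Socket accepted = server.accept();
            accepted.setSoTimeout(2_000); // bound the wait in case the FIN never arrives
            client.close();               // client sends FIN
            boolean sawEof = drainUntilPeerCloses(accepted);
            return sawEof && accepted.isClosed();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("peer shutdown detected and socket closed: " + demo());
    }
}
```

In a servlet container the application does not own the socket directly, so the equivalent step would be completing the async context and letting the container close the connection.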
Root Cause Analysis
The Streamable HTTP transport appears to hold open an SSE-style async response for each MCP session. When the client disconnects:
- The AsyncContext is never completed or timed out
- The server-side OutputStream/Sink has no subscriber check
- The Servlet container keeps the thread allocated to the async response
- The OS socket enters CLOSE-WAIT because the server process never calls close()
Impact
- Severity: Critical in production with active MCP clients
- Server becomes completely unresponsive within about a minute under moderate load (2-3 requests/second)
- K8s rolling deployments fail because new pods are immediately flooded by retrying clients
- No automatic recovery without restart + stopping upstream traffic
Workaround
Configure Tomcat to force-close idle connections:
server:
  tomcat:
    connection-timeout: 30000
    keep-alive-timeout: 30000
    max-connections: 200
    threads:
      max: 200
This allows Tomcat to reclaim stuck connections after 30 seconds, but is not a proper fix.
Suggested Fix
Register an AsyncListener on the servlet async context when opening the SSE stream:
asyncContext.addListener(new AsyncListener() {
    @Override
    public void onComplete(AsyncEvent event) { cleanupSession(sessionId); }

    @Override
    public void onTimeout(AsyncEvent event) { cleanupSession(sessionId); }

    @Override
    public void onError(AsyncEvent event) { cleanupSession(sessionId); }

    @Override
    public void onStartAsync(AsyncEvent event) { }
});
Additionally, set a reasonable asyncTimeout on the context (e.g., 5 minutes) so that abandoned sessions are eventually reclaimed even without explicit client disconnect detection.
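Because a container may invoke more than one listener callback for the same request (for example onError followed by onComplete), cleanupSession would likely need to be idempotent. The following is a servlet-independent sketch of what such a cleanup could look like. SessionRegistry, Session, and the closer callback are hypothetical names, not part of Spring AI or the MCP Java SDK. It also includes an age-based reaper as a stand-in for the timeout fallback described above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SessionRegistry {

    /** Hypothetical per-session state; a real transport would hold its stream or sink here. */
    record Session(String id, Instant openedAt, Runnable closer) {}

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();

    public void register(String id, Instant openedAt, Runnable closer) {
        sessions.put(id, new Session(id, openedAt, closer));
    }

    /**
     * Safe to call from onComplete, onTimeout, and onError alike:
     * only the call that actually removes the entry runs the closer.
     */
    public boolean cleanupSession(String id) {
        Session removed = sessions.remove(id);
        if (removed == null) {
            return false; // already cleaned up by an earlier callback
        }
        removed.closer().run();
        return true;
    }

    /** Fallback reaper: evicts sessions older than maxAge and returns how many were evicted. */
    public int evictOlderThan(Duration maxAge, Instant now) {
        int evicted = 0;
        // ConcurrentHashMap iteration is weakly consistent, so removing while iterating is safe.
        for (Session s : sessions.values()) {
            boolean expired = Duration.between(s.openedAt(), now).compareTo(maxAge) > 0;
            if (expired && cleanupSession(s.id())) {
                evicted++;
            }
        }
        return evicted;
    }

    public int size() {
        return sessions.size();
    }
}
```

Keying the decision on the result of remove() means concurrent callbacks cannot both run the closer, which avoids double-completing an async context.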
Related
This bug is related to (but distinct from) a KeepAliveScheduler issue in the MCP Java SDK where dead sessions are never evicted after ping failures. I have filed that separately at modelcontextprotocol/java-sdk.