feat: thread-safe SSH/NETCONF connection pooling #27

Open
h-network wants to merge 3 commits into Juniper:main from h-network:connection-pooling
Conversation

@h-network

feat: thread-safe SSH/NETCONF connection pooling

Summary

Reuse SSH/NETCONF sessions across tool invocations instead of opening
and tearing down a connection per call.

  • Thread-safe ConnectionPool with per-router locking
  • Auto-detection and reconnection of stale sessions
  • Background cleanup of idle connections (configurable timeout)
  • Graceful shutdown on server exit
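
The four behaviours listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `connect` and `is_alive` callables, the `cleanup_idle` and `close_all` method names, and the `FakeConnection` class are all hypothetical stand-ins so the example runs without a router.

```python
import threading
import time
from collections import defaultdict


class FakeConnection:
    """Hypothetical stand-in for an SSH/NETCONF session."""

    def __init__(self, router):
        self.router = router
        self.alive = True
        self.closed = False

    def close(self):
        self.alive = False
        self.closed = True


class ConnectionPool:
    """Illustrative pool: one cached session per router, one lock per router."""

    def __init__(self, connect, is_alive, idle_timeout=300.0):
        self._connect = connect            # assumed: router name -> connection
        self._is_alive = is_alive          # assumed: connection -> bool
        self._idle_timeout = idle_timeout  # seconds
        self._conns = {}                   # router -> (connection, last_used)
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()     # protects _conns and _locks

    def _lock_for(self, router):
        with self._guard:
            return self._locks[router]

    def get_connection(self, router):
        # Only callers targeting the same router serialize on this lock.
        with self._lock_for(router):
            with self._guard:
                entry = self._conns.get(router)
            if entry is not None and self._is_alive(entry[0]):
                conn = entry[0]
            else:
                if entry is not None:
                    entry[0].close()       # stale session: drop and reconnect
                conn = self._connect(router)
            with self._guard:
                self._conns[router] = (conn, time.monotonic())
            return conn

    def cleanup_idle(self, now=None):
        """Close sessions unused for longer than idle_timeout; return their routers."""
        now = time.monotonic() if now is None else now
        with self._guard:
            routers = list(self._conns)
        closed = []
        for router in routers:
            with self._lock_for(router):
                with self._guard:
                    entry = self._conns.get(router)
                    if entry is None or now - entry[1] <= self._idle_timeout:
                        continue
                    del self._conns[router]
                entry[0].close()
                closed.append(router)
        return closed

    def close_all(self):
        """Graceful shutdown: close every cached session."""
        with self._guard:
            conns = [conn for conn, _ in self._conns.values()]
            self._conns.clear()
        for conn in conns:
            conn.close()
```

In a server, `cleanup_idle` would typically be called periodically from a background thread and `close_all` from an exit hook; how the PR wires these up is not shown here.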

Performance

Tested on 5 Junos routers, 7 commands each (35 operations):

| Mode | Time | Speedup |
|---|---|---|
| Original | 24.5s | baseline |
| Pooled + sequential | 4.0s | 6.2x |
| Pooled + parallel | 1.2s | 20.7x |
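
As a quick check of how the speedup column relates to the times, speedup is the baseline time divided by the pooled time. The sketch below uses only the rounded figures from the table; the variable names are illustrative. Dividing the rounded times gives slightly lower ratios than those reported, which would be consistent with the reported values having been computed from unrounded timings (an assumption, not something stated in the PR).

```python
baseline = 24.5  # seconds, "Original" row
timings = {"Pooled + sequential": 4.0, "Pooled + parallel": 1.2}
reported = {"Pooled + sequential": 6.2, "Pooled + parallel": 20.7}

# Speedup recomputed from the rounded table values.
speedups = {mode: baseline / t for mode, t in timings.items()}

# Time implied by each reported speedup; both round back to the table's times.
implied = {mode: baseline / s for mode, s in reported.items()}

for mode in timings:
    print(f"{mode}: {speedups[mode]:.2f}x from rounded times, "
          f"implied time {implied[mode]:.2f}s")
```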

Benchmark methodology and reproduction steps: https://github.com/h-network/junos-mcp-server/tree/h-dev/benchmark

Closes #26

  Reuse SSH sessions across tool invocations instead of connect/teardown
  per call. Thread-safe ConnectionPool with per-router locking, stale
  detection, idle cleanup, and graceful shutdown.

  Benchmarked: 6.2x sequential, 20.7x parallel (5 routers, 35 ops).
Wrap int() parsing of JMCP_POOL_IDLE_TIMEOUT in try/except ValueError, log a warning, and fall back to the 300s default. This mirrors the pattern in get_timeout_with_fallback.
Addresses review feedback on Juniper#27.
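
A minimal sketch of the fallback described above, assuming the value is read from the process environment. The function name `get_pool_idle_timeout`, the logger name, and the `environ` parameter are hypothetical and only there to make the example testable; the actual code in jmcp.py may differ.

```python
import logging
import os

log = logging.getLogger("jmcp")  # logger name is an assumption

DEFAULT_POOL_IDLE_TIMEOUT = 300  # seconds, the default named in the commit message


def get_pool_idle_timeout(environ=os.environ):
    """Return the idle timeout in seconds, falling back to the default on bad input."""
    raw = environ.get("JMCP_POOL_IDLE_TIMEOUT")
    if raw is None:
        return DEFAULT_POOL_IDLE_TIMEOUT
    try:
        return int(raw)
    except ValueError:
        log.warning(
            "Invalid JMCP_POOL_IDLE_TIMEOUT=%r; using default %ds",
            raw,
            DEFAULT_POOL_IDLE_TIMEOUT,
        )
        return DEFAULT_POOL_IDLE_TIMEOUT
```
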
Addresses the event-loop-blocking concern raised on this PR. Four
handlers now dispatch the sync ConnectionPool.get_connection() call
through anyio.to_thread.run_sync, the same pattern already used in
handle_execute_junos_command_batch:

- handle_execute_junos_command          (jmcp.py:1163)
- handle_gather_device_facts            (jmcp.py:1810, _gather_facts_sync)
- handle_execute_junos_pfe_command      (jmcp.py:1963)
- handle_load_and_commit_config         (jmcp.py:2023, _load_and_commit_sync)

handle_render_and_apply_j2_template (line 1632) is intentionally not
included: it has interleaved `await context.info(...)` calls inside the
connection block, so it is deferred to a separate PR.
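
The dispatch pattern can be illustrated with a self-contained sketch. To keep it runnable without third-party packages it uses the standard library's `asyncio.to_thread` in place of `anyio.to_thread.run_sync`; with anyio the equivalent call would be `await anyio.to_thread.run_sync(get_connection_sync, router)`. The names `get_connection_sync` and `handle_command` are hypothetical stand-ins, not the handlers in jmcp.py.

```python
import asyncio
import time


def get_connection_sync(router):
    """Stand-in for the blocking ConnectionPool.get_connection() call."""
    time.sleep(0.05)  # simulates SSH/NETCONF session setup
    return f"session:{router}"


async def handle_command(router):
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(get_connection_sync, router)


async def main():
    ticks = 0

    async def heartbeat():
        # Keeps counting only while the event loop is not blocked.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    results = await asyncio.gather(*(handle_command(r) for r in ("r1", "r2", "r3")))
    hb.cancel()
    return results, ticks
```

Calling `get_connection_sync` directly inside the coroutine would instead stall the heartbeat for the full duration of each connection attempt.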

Validated against a 5-router vMX lab. Detail in the PR comment.
@nileshsimaria

Collaborator

Looks good.

@nileshsimaria

Collaborator

@h-network once you resolve conflict, it will merge. Thanks for your contributions. Appreciate it.

Development

Successfully merging this pull request may close these issues.

Connection pooling: performance improvement for SSH/NETCONF sessions

2 participants