Compose / bulk operations: run multiple actions in one MCP tool call (start with send_keys) #49

@tony

Description

Problem

Every MCP tool call costs at least one round-trip between the agent
and the server. When an agent needs to send several keystroke
sequences to a pane — or coordinate send_keys across multiple
panes — it has to make one MCP call per operation today. Each call
spawns its own libtmux subprocess and serialises through the MCP
transport layer. Agents end up burning turns and tokens describing
sequences that tmux itself can express in a single invocation.

A composed / bulk surface would let an agent request "run these N
operations in this order, with this error policy" as one MCP call.

tmux already has the relevant primitives

tmux's command queue supports chained commands via \; with
useful semantics that translate cleanly to an MCP tool contract:

  • Sequential execution with shared group ID. Each command in
    tmux cmd1 \; cmd2 \; becomes a cmdq_item sharing a group
    (see cmd-queue.c#L516).
  • Fail-fast for free. If a chained command returns
    CMD_RETURN_ERROR, the rest of the group is removed
    (cmd-queue.c#L780,
    cmd-queue.c#L470).
  • Hooks fire between chained commands
    (cmd-queue.c#L653),
    so user-installed after-send-keys etc. can perturb a batch —
    consistent with single-call semantics, but worth surfacing in
    docs.
  • send-keys is already poly-keystroke within a single
    invocation.
    tmux send-keys -t %1 "ls" Enter "cd ~" Enter
    injects four sequences in order
    (cmd-send-keys.c#L231).
    But -t accepts a single target
    (cmd-send-keys.c#L40),
    so cross-pane batching requires command chaining.
  • Grammar. ; is the sole separator between commands
    (cmd-parse.y#L386).
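To make the grammar point concrete: the `\;` escaping only exists to protect the separator from the shell. When the server invokes tmux via an argv (no shell), it passes a bare `;` token directly. A sketch with placeholder pane IDs:

```python
import shlex

# In a shell you would write:
#   tmux send-keys -t %1 ls Enter \; send-keys -t %2 pwd Enter
# The backslash protects ';' from the shell; tmux itself sees a bare ';' token.
shell_form = r"tmux send-keys -t %1 ls Enter \; send-keys -t %2 pwd Enter"

# Via subprocess (no shell), pass ';' as its own argv element:
argv = [
    "tmux",
    "send-keys", "-t", "%1", "ls", "Enter",
    ";",
    "send-keys", "-t", "%2", "pwd", "Enter",
]

# shlex.split on the shell form yields exactly the argv tmux receives.
assert shlex.split(shell_form) == argv
```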

fastmcp and Pydantic already support the input shape

  • list[Model] parameters are routine in fastmcp's own examples
    (e.g. examples/complex_inputs.py, examples/memory.py).
  • Annotated[list[Op], Field(min_length=1, max_length=N)] renders
    as standard JSON Schema minItems / maxItems.
  • Pydantic surfaces per-item validation errors with location
    indices (loc: (3,)), so clients learn exactly which op was
    malformed.
  • ctx.report_progress(k, total) is fastmcp's standard streaming
    primitive for long batches.
  • Discriminated unions are accepted but fastmcp strips the
    discriminator field from the client-facing schema.
    Pydantic
    still resolves the union server-side; clients see anyOf
    without the discriminator hint. Heterogeneous batches work but
    lose some client ergonomics.
  • MCP spec only requires the root be an object; a tool with
    operations: list[Op] satisfies that.
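A stdlib-only sketch of the per-item error reporting described above — in the real server Pydantic produces this via its `loc` tuples; `validate_ops` is a hypothetical name for illustration:

```python
def validate_ops(operations, min_length=1, max_length=50):
    """Mimic minItems/maxItems plus per-item checks, reporting the
    offending index the way Pydantic's loc tuples do."""
    if not (min_length <= len(operations) <= max_length):
        raise ValueError(
            f"expected {min_length}..{max_length} operations, got {len(operations)}"
        )
    errors = []
    for i, op in enumerate(operations):
        if not isinstance(op.get("keys"), str):
            errors.append((i, "keys must be a string"))  # analogous to loc: (i,)
    return errors
```

The point is that a malformed op surfaces with its index, so a client can repair exactly one item instead of resending the batch blind.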

Three strategies (pick one to start; others can follow)

Strategy A — Homogeneous list of send_keys ops (recommended for the first cut)

from typing import Annotated, Literal

from fastmcp import Context
from pydantic import BaseModel, Field


class SendKeysOp(BaseModel):
    keys: str
    pane_id: str | None = None
    session_name: str | None = None
    session_id: str | None = None
    window_id: str | None = None
    enter: bool = True
    literal: bool = False
    suppress_history: bool = False


class SendKeysOpResult(BaseModel):
    pane_id: str | None
    success: bool
    error: str | None = None


async def send_keys_batch(
    operations: Annotated[list[SendKeysOp], Field(min_length=1, max_length=50)],
    on_error: Literal["stop", "continue"] = "stop",
    socket_name: str | None = None,
    ctx: Context | None = None,
) -> list[SendKeysOpResult]:
    """Run several send_keys operations as one MCP call."""
  • Pros: clean fastmcp idiom; mirrors examples in the upstream
    library; per-item Pydantic validation; explicit on_error
    policy maps onto tmux's group-fail behavior; the single-op
    send_keys tool keeps its existing contract for the common
    case.
  • Cons: only batches one verb. Agents who want to interleave
    send_keys with wait_for_text or paste_text still need
    multiple calls.

Strategy B — Heterogeneous pane-op batch

from typing import Annotated, Literal

from pydantic import BaseModel, Field


class KeysOp(BaseModel):
    kind: Literal["keys"]
    keys: str
    pane_id: str
    enter: bool = True


class PasteOp(BaseModel):
    kind: Literal["paste"]
    text: str
    pane_id: str


class WaitOp(BaseModel):
    kind: Literal["wait_for_text"]
    pane_id: str
    pattern: str
    timeout: float = 8.0


PaneOp = Annotated[KeysOp | PasteOp | WaitOp, Field(discriminator="kind")]


async def pane_ops(
    operations: Annotated[list[PaneOp], Field(min_length=1, max_length=50)],
    on_error: Literal["stop", "continue"] = "stop",
    ...
) -> list[OpResult]: ...
  • Pros: expresses agent intent at the verb level
    (send_keys → wait_for_text → capture_pane in one call).
    Highest leverage for typical agent workflows.
  • Cons: fastmcp's _strip_discriminator() removes the
    discriminator field from the schema sent to clients — they
    still send valid payloads, but lose the upfront tag hint. Tool
    name no longer maps to a single tmux verb, so naming /
    discovery suffer.
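Server-side, the discriminator still does its job even though clients never see it in the schema. A stdlib sketch of the dispatch shape (handler bodies are placeholders; Pydantic resolves the union from `kind` before any of this runs):

```python
def run_pane_op(op: dict) -> str:
    """Dispatch a heterogeneous pane op on its 'kind' tag."""
    handlers = {
        "keys": lambda o: f"send-keys to {o['pane_id']}",
        "paste": lambda o: f"paste into {o['pane_id']}",
        "wait_for_text": lambda o: f"wait for {o['pattern']!r} in {o['pane_id']}",
    }
    kind = op.get("kind")
    if kind not in handlers:
        raise ValueError(f"unknown op kind: {kind!r}")
    return handlers[kind](op)
```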

Strategy C — Native tmux chained execution

Same MCP-level shape as A, but the implementation builds a single
tmux send-keys -t %1 ... \; send-keys -t %2 ... invocation and
execs it once instead of looping libtmux's Pane.send_keys.

  • Pros: one subprocess instead of N; free fail-fast via
    tmux's cmdq_remove_group(); lower latency on large batches.
  • Cons: tighter coupling to tmux's argv syntax (no libtmux
    typed wrappers to lean on); careful escaping of ; and shell
    metacharacters in keystrokes; error attribution per-item is
    harder because tmux's chain stops on first failure without
    emitting an index.
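A sketch of the argv construction Strategy C implies. Assumptions: per tmux(1), a literal `;` in an argument must be escaped as `\;` so tmux does not read it as a command separator; `build_chained_argv` is a hypothetical helper, not existing code:

```python
def build_chained_argv(operations):
    """Build one tmux invocation chaining N send-keys commands with ';'."""
    argv = ["tmux"]
    for i, op in enumerate(operations):
        if i:
            argv.append(";")  # command separator in tmux's grammar
        # Escape literal semicolons so tmux doesn't treat them as separators.
        keys = op["keys"].replace(";", r"\;")
        argv += ["send-keys", "-t", op["pane_id"], keys]
        if op.get("enter", True):
            argv.append("Enter")
    return argv
```

The whole list goes to a single `subprocess.run(argv)` with no shell involved, so only tmux's own escaping rules apply; the shell-metacharacter worry in the cons list above is about getting this escaping right, not about shell interpolation.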

Recommendation

Start with Strategy A. It is the lowest-risk path:

  • Single homogeneous op type — keeps the schema and docs tight.
  • on_error="stop" default mirrors tmux's group-fail behavior,
    so the contract feels native.
  • max_length=50 (or similar) caps abuse.
  • ctx.report_progress per item for long batches.
  • Existing send_keys tool stays — no breaking change.

Strategy B becomes interesting once real agent workflows demand
heterogeneous batches (e.g. send_keys → wait_for_text →
capture_pane). Strategy C is a future optimisation if profiling
shows the per-op subprocess cost matters.

Open questions for the design discussion

  • on_error="continue" semantics. When op 3 fails and ops
    4–N still run, what's in the result for op 3? success=False,
    error="...". Should ops 4–N see the failure (return early
    themselves) or proceed independently? Lean toward proceed.
  • Per-pane atomicity. Should the batch interleave ops across
    panes deterministically, or just run them in declared order?
    Declared order is simpler.
  • Result schema. Does the result want timing info per op
    (elapsed_seconds) like wait_for_text returns? Probably yes
    for diagnostics.
  • Naming. send_keys_batch reads cleanly. send_keys keeps
    the existing surface. Alternative: a bulk= flag on
    send_keys that accepts keys: list[str] | str — but that's
    schema-overloading and harder to validate cleanly in Pydantic.
  • Progress reporting overhead. For very small batches (2–3
    ops) the progress notifications may cost more than they save.
    Threshold for emission?

Acceptance criteria

For the Strategy A implementation:

  • send_keys_batch tool registered with Pydantic-validated
    list[SendKeysOp] input and list[SendKeysOpResult] output.
  • on_error parameter with "stop" / "continue" literals,
    default "stop".
  • Per-op ctx.report_progress(k, total=N) emitted when ctx is
    available.
  • Tests: validates the input bounds (min_length=1,
    max_length); covers on_error="stop" short-circuiting;
    covers on_error="continue" partial success.
  • Docs: the tool's docstring states the order guarantee, error
    modes, and the relationship to single-op send_keys.
  • CHANGES entry describes the new surface (single user-facing
    capability, not split per-mechanism).

References

  • tmux command-queue mechanics:
    cmd-queue.c
    at the cited line anchors.
  • tmux send-keys argv handling:
    cmd-send-keys.c.
  • fastmcp list-input examples in the upstream repo's
    examples/complex_inputs.py and examples/memory.py.
  • Pydantic v2 discriminated-union pattern:
    Annotated[A | B | C, Field(discriminator="kind")].
