fred-agent/agentic-sdk
Beyond LangChain & LangGraph: What a Production Agent SDK Must Provide (By Design)

Context and intent

LangChain and LangGraph are excellent foundations for building LLM applications and agent-like flows. They accelerate prototyping and provide powerful primitives for:

  • message-based model interaction,
  • tool calling,
  • retrieval integration,
  • and graph-based control flow.

However, when building enterprise-grade, portable, and secure agentic systems, teams quickly encounter a set of requirements that are intentionally out of scope for these frameworks. This is not a critique. It is a recognition that a framework for composing LLM logic is not the same thing as a runtime and governance layer for production agents.

The goal of this note is to make those “out-of-scope” areas explicit and to outline what an Agent SDK layer above LangChain/LangGraph should provide—so that agent developers can focus on capabilities and reasoning rather than infrastructure and security.


What LangChain/LangGraph intentionally do not aim to be

1) A runtime lifecycle manager for agents

LangGraph can define how a graph executes, but it does not define how an agent behaves as a managed runtime component:

  • when it is created and torn down,
  • what is per-process vs per-session vs per-turn,
  • how resources are initialized and cleaned up,
  • how concurrency and cancellation should be handled safely.

SDK responsibility: define explicit lifecycle hooks and ownership boundaries.

2) An identity and authorization system

Enterprise agents must act on behalf of a user, not as an anonymous process. Identity propagation and authorization are not part of LangChain/LangGraph’s contract:

  • per-user access tokens,
  • delegated permissions,
  • policy enforcement per tool and per dataset,
  • audit requirements.

SDK responsibility: make identity first-class in runtime context and enforce policy at integration points.

3) A recoverability and correctness layer for long-running workflows

LangGraph checkpointing is useful, but production correctness requires additional decisions:

  • what state is durable and versioned,
  • how restarts behave (resume vs replay vs restart),
  • idempotency strategy for tool side effects,
  • compensation/retry semantics and timeouts.

SDK responsibility: define persistence contracts and failure semantics, optionally integrating with a workflow engine.
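One of these decisions, an idempotency strategy for tool side effects, can be sketched concretely. The helper below is an illustrative assumption, not an existing Fred or LangGraph API; `run_once` and its in-memory `_executed` map stand in for what a real SDK would back with durable storage:

```python
import hashlib
import json

def idempotency_key(tool_name: str, args: dict, exchange_id: str) -> str:
    """Derive a stable key so a replayed tool call can be detected and skipped."""
    blob = json.dumps({"tool": tool_name, "args": args, "x": exchange_id},
                      sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

_executed: dict = {}  # stand-in for a durable store

def run_once(tool_name: str, args: dict, exchange_id: str, fn):
    """Execute a side-effecting tool at most once per (tool, args, exchange)."""
    key = idempotency_key(tool_name, args, exchange_id)
    if key not in _executed:          # a replay after restart skips the side effect
        _executed[key] = fn(**args)
    return _executed[key]
```

With a contract like this, "resume vs replay" becomes a safe question: replaying a checkpointed graph re-issues the same keys and therefore does not duplicate side effects.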

4) An application-level message/event protocol

Framework messages are designed primarily for model interaction, not for:

  • durable storage,
  • UI rendering,
  • stable ordering across streaming and final outputs,
  • typed “agent events” (planning steps, tool execution, evidence, etc.).

SDK responsibility: define stable message identifiers, ordering guarantees, event typing, and forward-compatible schemas.

5) A governed integration surface for external services

In production, “tools” are not just Python callables. They are integration points with:

  • authenticated APIs,
  • structured IO contracts,
  • controlled error taxonomy,
  • audit and observability requirements,
  • multi-tenant boundaries.

SDK responsibility: provide service clients and capability wrappers with predictable contracts and security.

6) A portability layer across constrained environments

Production deployments may run on-prem, disconnected, or in regulated zones. Portability requires:

  • explicit configuration boundaries (secrets vs non-secrets),
  • minimal assumptions about network and storage,
  • stable behavior when dependencies degrade.

SDK responsibility: define deployment-agnostic abstractions and avoid implicit global dependencies.


The Agent SDK: the missing layer above the frameworks

A practical “Agent SDK above LangChain/LangGraph” should not reinvent orchestration. It should provide contracts and runtime governance around the orchestration.

Core design principle

Agent developers should consume services by injection and use stable contracts. They should not implement:

  • authentication plumbing,
  • transport setup,
  • discovery/connection management,
  • policy gates,
  • low-level observability.

Instead, they implement:

  • agent reasoning structure (graphs),
  • capability selection,
  • domain logic,
  • prompts and post-processing,
  • deterministic composition.

Minimal “SDK surface” that enables healthy implementations

A) RuntimeContext (identity + scope + environment)

A single object that is always available during execution, containing:

  • user_id, user_groups (or claims)
  • session_id / exchange_id (or equivalent trace keys)
  • identity tokens or token providers (never hard-coded secrets)
  • execution scope (corpus, session attachments, tenant constraints)
  • language and UI preferences (optional)
  • telemetry correlation identifiers

Key property: agent code reads context; it does not construct context.
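A minimal sketch of such a context object, assuming Python; the field names follow the list above, and the frozen dataclass is one way (an assumption, not a prescription) to enforce the read-only property:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass(frozen=True)  # frozen: agent code reads context, it never mutates it
class RuntimeContext:
    user_id: str
    user_groups: tuple                              # or claims
    session_id: str
    exchange_id: str
    token_provider: Optional[Callable] = None       # a provider, never a raw secret
    scope: dict = field(default_factory=dict)       # corpus, attachments, tenant
    language: str = "en"                            # optional UI preference
    trace_id: str = ""                              # telemetry correlation
```

Because the object is immutable, any attempt by agent code to construct or rewrite identity fields fails loudly instead of silently widening permissions.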

B) Lifecycle hooks (explicit and testable)

A standard lifecycle contract for agent runtime components:

  • async_init(runtime_context) for initialization
  • aclose() for cleanup
  • a clear definition of what is per-turn vs per-session vs shared

Key property: lifecycle must be predictable under cancellation and concurrency.
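A sketch of that contract, assuming Python's async model; `ExampleAgent` and `run_turn` are hypothetical names used only to show where the guarantees live:

```python
import asyncio
from typing import Any, Protocol

class AgentLifecycle(Protocol):
    """Standard lifecycle contract for agent runtime components."""
    async def async_init(self, runtime_context: Any) -> None: ...
    async def aclose(self) -> None: ...

class ExampleAgent:
    """Per-session component: async_init once per session, aclose at teardown."""
    def __init__(self) -> None:
        self.ready = False

    async def async_init(self, runtime_context: Any) -> None:
        # acquire per-session resources here (clients, caches, connections)
        self.ready = True

    async def aclose(self) -> None:
        # must be safe to call even if async_init was cancelled midway
        self.ready = False

async def run_turn(agent: ExampleAgent, ctx: Any) -> bool:
    await agent.async_init(ctx)
    try:
        return agent.ready            # the actual turn would execute here
    finally:
        await agent.aclose()          # cleanup runs even on cancellation
```

The try/finally shape is the point: cancellation of the turn still drives the component through `aclose()`, so resource ownership stays predictable.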

C) Capability model (separate “what can be done” from “who reasons”)

Introduce a distinction:

  • Capabilities: callable units with typed IO contracts (tools, services, sub-agents)
  • Agents: reasoning orchestrators that select and sequence capabilities

Key property: capabilities can be tested in isolation and governed centrally.
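The distinction can be made concrete with a small sketch (hypothetical names, assuming Python dataclasses for the typed IO contract):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Capability:
    """A callable unit with a typed IO contract, testable in isolation."""
    name: str
    input_type: type
    output_type: type
    fn: Callable

@dataclass
class SearchQuery:
    text: str

@dataclass
class SearchResult:
    hits: list

def _search(query: SearchQuery) -> SearchResult:
    # placeholder body; a real capability would call a governed service client
    return SearchResult(hits=[query.text.upper()])

search = Capability("search", SearchQuery, SearchResult, _search)
```

An agent then selects and sequences such capabilities; it never implements transport or authentication inside `_search`, which is what makes the unit testable and governable on its own.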

D) Service injection (platform-owned clients)

Provide injectable, identity-aware clients for:

  • retrieval (vector search, rerank, scope routing),
  • document and asset storage (upload/download),
  • MCP transport management,
  • metrics and tracing,
  • optional agent-to-agent calls (A2A) under policy control.

Key property: the SDK owns authentication and policy enforcement.
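As a sketch of that ownership split (hypothetical `RetrievalClient`; the transport callable stands in for an authenticated backend the SDK would wire up):

```python
from typing import Callable, List

class RetrievalClient:
    """Platform-owned client: identity and policy live here, not in agent code."""
    def __init__(self, runtime_context: dict, transport: Callable):
        self._ctx = runtime_context
        self._transport = transport

    def search(self, query: str) -> List[str]:
        # the caller's identity travels with every request automatically;
        # the agent never sees tokens or transport configuration
        return self._transport(query, user=self._ctx["user_id"])

def fake_transport(query: str, user: str) -> List[str]:
    return [f"{user}:{query}"]        # stand-in for an authenticated backend
```

The agent receives the client by injection and calls `search`; everything below that method signature is the SDK's responsibility.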

E) Policy gates (explicit choke points)

A production SDK must have explicit, auditable choke points where policy is enforced:

  • which tools are exposed to which agent,
  • which datasets/scopes are accessible to which user,
  • which transports are allowed (and under which constraints),
  • what is allowed to leave the trust boundary (e.g., external calls).

Key property: policy is testable, observable, and independent of prompts.
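A minimal sketch of one such choke point, a tool-exposure gate (hypothetical names; a real gate would also cover scopes, transports, and egress):

```python
class PolicyError(PermissionError):
    pass

class ToolPolicyGate:
    """Explicit choke point: every tool exposure passes through here and is logged."""
    def __init__(self, allowed: dict):
        self._allowed = allowed           # agent name -> set of allowed tool names
        self.audit_log: list = []

    def check(self, agent: str, tool: str) -> None:
        ok = tool in self._allowed.get(agent, set())
        self.audit_log.append((agent, tool, ok))   # auditable by construction
        if not ok:
            raise PolicyError(f"agent {agent!r} may not call tool {tool!r}")
```

Because the gate is ordinary code rather than prompt text, it can be unit-tested, observed, and versioned independently of any model behavior.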

F) Event model for observability and UI semantics

A structured event stream (or message metadata) describing:

  • graph steps (planning, retrieval, tool calls, synthesis),
  • evidence selection (sources, scores, scope),
  • errors (categorized, user-facing vs internal),
  • latency/cost metrics.

Key property: observability is agent-native, not only LLM-call tracing.
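One possible shape for such an event (a sketch; the `kind` values mirror the list above, and the open `payload` dict is how forward compatibility is kept):

```python
import time
from dataclasses import dataclass, field

@dataclass
class AgentEvent:
    """Agent-native event; the payload dict keeps the schema forward-compatible."""
    kind: str          # "planning" | "retrieval" | "tool_call" | "error" | ...
    exchange_id: str
    seq: int           # stable ordering across streaming and final outputs
    payload: dict = field(default_factory=dict)
    ts: float = field(default_factory=time.time)
```

A UI or audit consumer can then order by `(exchange_id, seq)` regardless of how chunks arrived over the wire.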


Example: MCP integration as a “SDK concern” (not an agent concern)

MCP is a strong illustration of why the SDK layer matters:

  • connecting to servers,
  • propagating end-user identity,
  • selecting which tools are exposed,
  • handling token refresh and failures safely,
  • ensuring cancellation and cleanup correctness,
  • producing audit logs and KPIs.

LangChain/LangGraph can orchestrate tool nodes, but they do not define how tool transports are bound per user or governed. That is by design and belongs in the SDK/runtime layer.

Target outcome: the agent developer simply declares that the agent needs MCP-backed capabilities, and the SDK injects a ready-to-use toolset bound to the runtime context and policy.
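That target outcome can be sketched as follows. All names here (`AgentSpec`, `resolve_tools`, the registry) are hypothetical illustrations, not an existing Fred or MCP API; real resolution would also handle connection, token refresh, and cleanup:

```python
class AgentSpec:
    """What the agent developer writes: a declaration, not a connection."""
    def __init__(self, name: str, mcp_servers: list):
        self.name = name
        self.mcp_servers = mcp_servers

def resolve_tools(spec: AgentSpec, runtime_context: dict, registry: dict) -> list:
    """SDK side: binds each declared server's tools to the end user's identity."""
    tools = []
    for server in spec.mcp_servers:
        for tool in registry.get(server, []):
            # per-user binding: transport, auth, and refresh are owned by the SDK
            tools.append((server, tool, runtime_context["user_id"]))
    return tools
```

The declaration/resolution split is what keeps transport concerns out of agent code: the same `AgentSpec` works unchanged when the MCP servers move or their auth changes.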

Agent-to-Agent (A2A) Invocation: A Missing First-Class SDK Concept

Why this deserves explicit treatment

In many agent frameworks, calling another agent is treated as a convenience feature—often exposed as a direct method call. Projects like Agno illustrate this well by making agents feel callable and composable, which significantly improves developer ergonomics.

However, this apparent simplicity hides a deeper issue: agent-to-agent interaction is not a single semantic operation. In production-grade systems, it spans multiple fundamentally different execution and trust models. Treating all of them as the same abstraction leads to unclear security boundaries, ambiguous lifecycles, and accidental coupling between agents.

In Fred today, one agent can technically invoke another via internal AgentFlow mechanisms. This works, but it is intentionally not exposed as a stable public API. The concern is not feasibility, but correctness: freezing the wrong abstraction too early would make later governance and portability much harder.


The core question

What does it actually mean when one agent “calls” another?

There are several distinct cases that must not be conflated.

1. In-process delegation (capability-style)

  • Agent B is used by Agent A like a callable capability.
  • Same user identity.
  • Same session.
  • Same runtime and lifecycle.
  • No persistence boundary.

This is closest to a function-call model and aligns with what frameworks like Agno emphasize.

2. Sub-agent reasoning

  • Agent A delegates a reasoning task to Agent B.
  • Agent B may perform retrieval, tool usage, or multi-step planning.
  • Identity is explicitly delegated, not implicitly shared.
  • Execution must remain traceable as part of Agent A’s reasoning flow.

This requires explicit delegation semantics and clear observability.

3. Asynchronous agent tasking

  • Agent A creates a task for Agent B.
  • Agent B may run later, retry, resume, or fail independently.
  • Results are consumed asynchronously.
  • Persistence, idempotency, and failure semantics are required.

This pattern overlaps with workflow orchestration rather than pure reasoning.

4. Cross-boundary agent invocation

  • Agent B runs in another process, node, or trust zone.
  • Strong authentication, authorization, and auditing are mandatory.
  • Data exchange must follow explicit, versioned contracts.

At this point, agent interaction becomes a distributed systems problem.


Why a single “simple” API is insufficient

A single abstraction such as other_agent.run(prompt) cannot safely represent all of these cases. While convenient, such an API:

  • hides identity propagation,
  • obscures permission boundaries,
  • bypasses lifecycle management,
  • and makes observability and auditing ambiguous.

In production environments, ambiguity is the enemy of safety.


Proposed SDK direction: Explicit A2A semantics

Rather than a single agent-call primitive, the SDK should expose explicit interaction modes that encode intent and constraints.

Conceptually:

  • The calling agent declares how it wants to interact (capability, delegation, task).
  • The runtime mediates the call.
  • Identity, scope, policy, and lifecycle are enforced centrally.

Example (conceptual, not prescriptive):

  • target agent: DocInsight
  • mode: capability | delegate | task
  • input payload: typed, versioned
  • execution context: explicit identity and scope

The key point is not syntax, but semantic clarity.
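To make that semantic clarity tangible, here is one possible (hypothetical, non-prescriptive) Python sketch of an explicit A2A call mediated by the runtime:

```python
from dataclasses import dataclass
from enum import Enum

class A2AMode(Enum):
    CAPABILITY = "capability"   # in-process, same identity, same session
    DELEGATE = "delegate"       # explicit identity delegation, traced sub-run
    TASK = "task"               # asynchronous, persisted, independent lifecycle

@dataclass(frozen=True)
class A2ACall:
    target: str                 # e.g. "DocInsight"
    mode: A2AMode
    payload: dict               # typed and versioned in a real SDK
    caller_identity: str        # explicit, never implicitly shared

def mediate(call: A2ACall, policy: dict) -> str:
    """Runtime mediation: the policy gate runs before any target agent is reached."""
    allowed = policy.get(call.caller_identity, set())
    if call.target not in allowed:
        raise PermissionError(f"{call.caller_identity} may not call {call.target}")
    return f"{call.mode.value}:{call.target}"
```

Note that the caller never holds a reference to the target agent object: intent (`mode`), identity, and payload are declared, and the runtime decides how, and whether, the call executes.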


Separation of responsibilities

  • Agents reason and orchestrate.
  • Capabilities execute bounded, typed actions.
  • A2A calls are mediated by the runtime, not implemented as direct object calls.

Agent code should never:

  • instantiate another agent directly,
  • manually propagate identity or tokens,
  • bypass runtime lifecycle hooks,
  • or sidestep policy enforcement.

Design goals for A2A in the SDK

A first-class A2A API should guarantee:

  • Explicit identity delegation (never implicit sharing).
  • Clear lifecycle ownership (who starts, who stops, who retries).
  • Policy enforcement at agent boundaries.
  • End-to-end traceability and auditability.
  • Portability across in-process, on-prem, and distributed deployments.

Or, stated more succinctly:

The SDK should make agent-to-agent interaction easy to express and impossible to misuse by default.


What this enables organizationally

This separation creates a clean collaboration model between teams:

  • The SDK/platform team defines contracts, governance, lifecycle, security, portability, and observability.
  • The agent teams build domain capabilities and reasoning logic on top of those contracts.

This is the difference between “agents as prototypes” and “agents as products.”


Topics for Discussion

If two systems already share LangChain/LangGraph as the foundation, the natural question becomes:

“Which contracts must exist above the frameworks so that an agent can run securely and portably on any runtime (on-prem, disconnected, customer-controlled) without re-implementing platform concerns?”

A converged SDK should focus on:

  1. RuntimeContext + identity propagation
  2. Lifecycle and resource management
  3. Capability model (capability vs agent)
  4. Service injection (retrieval, storage, MCP, telemetry)
  5. Policy choke points
  6. Event/message schema for observability and UI

Or said differently:

LangChain/LangGraph define how to compose reasoning.
The SDK defines how to run that reasoning safely in the real world.


About

a public repository to think about the design of an industrial agentic sdk
