Commit a2ea91a

Make limits non-middleware
1 parent 3abde98 commit a2ea91a

9 files changed

Lines changed: 361 additions & 408 deletions

splunklib/ai/README.md

Lines changed: 30 additions & 46 deletions
@@ -613,7 +613,7 @@ triggers the retry logic described above. A custom `model_middleware` can interc
 to observe, log, or override the retry behavior. A custom `model_middleware` can also raise
 the `StructuredOutputGenerationException` manually to reject structured output and force a re-generation.
 
-The maximal number of re-tries is limited per agent loop invocation see [Default limit middlewares](#default-limit-middlewares).
+The maximal number of re-tries is limited per agent loop invocation see [Default limits](#default-limits).
 
 ### Subagents with structured output/input
 

@@ -894,103 +894,87 @@ model = OpenAIModel(...)
 service = connect(...)
 
 @before_model
-def log_usage(req: ModelRequest) -> None:
-    logger.debug(f"Steps: {req.state.total_steps}, Tokens: {req.state.token_count}")
+def log_steps(req: ModelRequest) -> None:
+    logger.debug(f"Steps: {len(req.state.messages)}")
 
 
 async with Agent(
     model=model,
     service=service,
     system_prompt="...",
-    middleware=[log_usage],
+    middleware=[log_steps],
 ) as agent: ...
 ```
 
-The hooks can stop the Agentic Loop under custom conditions by raising exceptions.
-The logic of the hook can be more advanced and include multiple conditions, for example, based on both token usage and execution time:
+The hooks can stop the Agentic Loop under custom conditions by raising exceptions, for example:
 
 ```py
 from splunklib.ai.hooks import before_model
 from splunklib.ai.middleware import AgentMiddleware, ModelRequest
 
-def token_and_step_limit(token_limit: float, step_limit: int) -> AgentMiddleware:
+def token_and_message_limit(message_limit: int) -> AgentMiddleware:
     @before_model
     def _hook(req: ModelRequest) -> None:
-        if req.state.token_count > token_limit or req.state.total_steps >= step_limit:
+        if len(req.state.messages) >= message_limit:
             raise Exception("Stopping Agentic Loop")
 
     return _hook
 
 
 async with Agent(
     ...,
-    middleware=[token_and_step_limit(token_limit=10_000, step_limit=5)],
+    middleware=[token_and_message_limit(message_limit=5)],
 ) as agent: ...
 ```
930929

931-
## Default limit middlewares
930+
## Default limits
932931

933932
Every `Agent` automatically applies sane default limits to prevent runaway execution
934-
or excessive token usage. Default limit middlewares are appended after any user-supplied
935-
middleware, so they always act on the final state of the request. If you override one of
936-
the defaults by passing your own instance, you are responsible for its position in the
937-
chain - place it last if you want the same behavior.
933+
or excessive token usage.
938934

939-
| Middleware | Default | Measured |
935+
| Limit | Default | Measured |
940936
|---|---|---|
941-
| `TokenLimitMiddleware` | 200 000 tokens | token count of messages passed to the model |
942-
| `StepLimitMiddleware` | 100 steps | steps taken |
943-
| `TimeoutLimitMiddleware` | 600 seconds (10 minutes) | per `invoke` call |
944-
| `StructuredOutputRetryLimitMiddleware` | 3 retries | per `invoke` call |
937+
| `max_tokens` | 200 000 tokens | token count of messages passed to the model |
938+
| `max_steps` | 100 steps | number of messages in the conversation |
939+
| `timeout` | 600 seconds (10 minutes) | per `invoke` call |
940+
| `max_structured_output_retires` | 3 retries | per `invoke` call |
945941

946-
`TokenLimitMiddleware` and `StepLimitMiddleware` check the values from the messages passed to the
947-
model on each call. `TimeoutLimitMiddleware` and `StructuredOutputRetryLimitMiddlewa` resets its
948-
deadline/limit on each `invoke`, so effectively these limit only the agent loop.
942+
`max_tokens` and `max_steps` are checked against the messages passed to the model on each call.
943+
`timeout` and `max_structured_output_retires` reset on each `invoke`, so they limit only the
944+
current agent loop invocation.
949945

950946
When a limit is exceeded, the agent raises the corresponding exception:
951-
`TokenLimitExceededException`, `StepsLimitExceededException`, or `TimeoutExceededException`,
947+
`TokenLimitExceededException`, `StepsLimitExceededException`, `TimeoutExceededException`, or
952948
`StructuredOutputRetryLimitExceededException`.
953949

 ### Overriding defaults
 
-To override a specific limit, pass your own instance of the corresponding middleware
-class. The default for that limit is suppressed automatically - the other defaults
-remain active:
+Limits are configured via the `AgentLimits` dataclass passed to the `Agent` constructor.
+Only the fields you specify are overridden; the rest keep their defaults:
 
 ```py
-from splunklib.ai.limits import (
-    TokenLimitMiddleware,
-    StepLimitMiddleware,
-    TimeoutLimitMiddleware,
-    StructuredOutputRetryLimitMiddleware,
-)
+from splunklib.ai.limits import AgentLimits
 
 async with Agent(
     ...,
-    middleware=[
-        TokenLimitMiddleware(50_000), # overrides default 200 000; other defaults still apply
-    ],
+    limits=AgentLimits(max_tokens=50_000), # overrides default 200 000; other defaults still apply
 ) as agent: ...
 ```
 
-To override all defaults, pass all of these to Agent's middleware list:
+To override all defaults:
 
 ```py
 async with Agent(
     ...,
-    middleware=[
-        StructuredOutputRetryLimitMiddleware(0), # no-retries.
-        TokenLimitMiddleware(50_000),
-        StepLimitMiddleware(10),
-        TimeoutLimitMiddleware(30.0),
-    ],
+    limits=AgentLimits(
+        max_tokens=50_000,
+        max_steps=10,
+        timeout=30.0,
+        max_structured_output_retires=0, # no retries
+    ),
 ) as agent: ...
 ```
 
-**Note**: When overriding limit middlewares, order matters. Place `StructuredOutputRetryLimitMiddleware`
-first and `TokenLimitMiddleware`, `StepLimitMiddleware`, and `TimeoutLimitMiddleware` last,
-otherwise the limits may not behave as expected.
-
 There is no explicit opt-out - the intent is that agents should always have some guardrails.
 
 ## Logger

splunklib/ai/agent.py

Lines changed: 9 additions & 0 deletions
@@ -26,6 +26,7 @@
 from splunklib.ai.conversation_store import ConversationStore
 from splunklib.ai.core.backend import AgentImpl
 from splunklib.ai.core.backend_registry import get_backend
+from splunklib.ai.limits import AgentLimits
 from splunklib.ai.messages import AgentResponse, BaseMessage, HumanMessage, OutputT
 from splunklib.ai.middleware import AgentMiddleware
 from splunklib.ai.model import PredefinedModel
@@ -47,6 +48,8 @@
 _testing_app_id: str | None = None
 
 DEFAULT_TOOL_SETTINGS = ToolSettings(local=False, remote=None)
+DEFAULT_AGENT_LIMITS = AgentLimits()
+
 _SPLUNK_SYSTEM_USER = "splunk-system-user"
 
 
@@ -133,6 +136,10 @@ class Agent(BaseAgent[OutputT]):
 
         Never invoke an Agent using the same thread_id more than once concurrently
         while using the same conversation_store.
+
+    limits:
+        Optional `AgentLimits` instance controlling the built-in safety limits.
+        When omitted, sane defaults are applied automatically.
     """
 
     _impl: AgentImpl[OutputT] | None
@@ -149,6 +156,7 @@ def __init__(
         output_schema: type[OutputT] | None = None,
         input_schema: type[BaseModel] | None = None, # Only used by Subagents
         middleware: Sequence[AgentMiddleware] | None = None,
+        limits: AgentLimits = DEFAULT_AGENT_LIMITS,
         name: str = "", # Only used by Subagents
         description: str = "", # Only used by Subagents
         logger: Logger | None = None,
@@ -169,6 +177,7 @@ def __init__(
             logger=logger,
             conversation_store=conversation_store,
             thread_id=thread_id if thread_id is not None else str(uuid4()),
+            limits=limits,
         )
 
         self._service = service

splunklib/ai/base_agent.py

Lines changed: 9 additions & 28 deletions
@@ -22,14 +22,7 @@
 
 from splunklib.ai.conversation_store import ConversationStore
 from splunklib.ai.limits import (
-    DEFAULT_STEP_LIMIT,
-    DEFAULT_STRUCTURED_OUTPUT_RETRY_LIMIT,
-    DEFAULT_TIMEOUT_SECONDS,
-    DEFAULT_TOKEN_LIMIT,
-    StepLimitMiddleware,
-    StructuredOutputRetryLimitMiddleware,
-    TimeoutLimitMiddleware,
-    TokenLimitMiddleware,
+    AgentLimits,
 )
 from splunklib.ai.messages import AgentResponse, BaseMessage, OutputT
 from splunklib.ai.middleware import AgentMiddleware
@@ -53,6 +46,7 @@ class BaseAgent(Generic[OutputT], ABC): # noqa: UP046 TODO[BJ]
     _logger: logging.Logger
     _conversation_store: ConversationStore | None = None
     _thread_id: str
+    _limits: AgentLimits
 
     def __init__(
         self,
@@ -69,6 +63,7 @@ def __init__(
         logger: logging.Logger | None,
         conversation_store: ConversationStore | None,
         thread_id: str,
+        limits: AgentLimits,
     ) -> None:
         self._system_prompt = system_prompt
         self._model = model
@@ -79,26 +74,8 @@ def __init__(
         self._agents = tuple(agents) if agents else ()
         self._input_schema = input_schema
         self._output_schema = output_schema
-        user_middleware = tuple(middleware) if middleware else ()
-        user_middleware_types = {type(m) for m in user_middleware}
-
-        # NOTE: we're creating separate instances per agent - TimeoutLimitMiddleware is stateful
-        # and sharing one would cause agents to overwrite each other's deadline.
-        predefined_before: list[AgentMiddleware] = [
-            StructuredOutputRetryLimitMiddleware(DEFAULT_STRUCTURED_OUTPUT_RETRY_LIMIT),
-        ]
-        predefined_after: list[AgentMiddleware] = [
-            TokenLimitMiddleware(DEFAULT_TOKEN_LIMIT),
-            StepLimitMiddleware(DEFAULT_STEP_LIMIT),
-            TimeoutLimitMiddleware(DEFAULT_TIMEOUT_SECONDS),
-        ]
-
-        self._middleware = (
-            *{m for m in predefined_before if type(m) not in user_middleware_types},
-            *user_middleware,
-            *{m for m in predefined_after if type(m) not in user_middleware_types},
-        )
-
+        self._limits = limits
+        self._middleware = middleware
         self._trace_id = secrets.token_hex(16) # 32 Hex characters
         self._conversation_store = conversation_store
         self._thread_id = thread_id
@@ -177,3 +154,7 @@ def conversation_store(self) -> ConversationStore | None:
     @property
     def default_thread_id(self) -> str:
         return self._thread_id
+
+    @property
+    def limits(self) -> AgentLimits:
+        return self._limits
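
The `splunklib/ai/limits.py` change is not shown in this excerpt, so the definition of `AgentLimits` is not visible here. As a minimal sketch only, assuming it is a frozen dataclass whose field names and defaults match the README table above, the "only the fields you specify are overridden" behavior could look like the following. The class body is a hypothetical stand-in, not the library's code; the field spelling `max_structured_output_retires` is copied from the README diff.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentLimits:
    """Hypothetical stand-in for splunklib.ai.limits.AgentLimits (assumed shape)."""

    max_tokens: int = 200_000  # token count of messages passed to the model
    max_steps: int = 100  # number of messages in the conversation
    timeout: float = 600.0  # seconds, per invoke call
    max_structured_output_retires: int = 3  # per invoke call; spelled as in the README


# No arguments: every field keeps its default.
defaults = AgentLimits()

# Override one field; the remaining fields keep their defaults.
custom = AgentLimits(max_tokens=50_000)

# Derive a stricter copy without mutating the original instance.
stricter = replace(custom, max_steps=10)
```

If the real class is a plain (mutable) dataclass instead, the module-level `DEFAULT_AGENT_LIMITS` instance used as the default argument in `Agent.__init__` would be shared by every agent that does not pass `limits`, which may be worth checking in review.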
