@@ -613,7 +613,7 @@ triggers the retry logic described above. A custom `model_middleware` can interc
 to observe, log, or override the retry behavior. A custom `model_middleware` can also raise
 the `StructuredOutputGenerationException` manually to reject structured output and force a re-generation.

-The maximal number of re-tries is limited per agent loop invocation see [Default limit middlewares](#default-limit-middlewares).
+The maximal number of re-tries is limited per agent loop invocation see [Default limits](#default-limits).

 ### Subagents with structured output/input

@@ -894,103 +894,87 @@ model = OpenAIModel(...)
 service = connect(...)

 @before_model
-def log_usage(req: ModelRequest) -> None:
-    logger.debug(f"Steps: {req.state.total_steps}, Tokens: {req.state.token_count}")
+def log_steps(req: ModelRequest) -> None:
+    logger.debug(f"Steps: {len(req.state.messages)}")


 async with Agent(
     model=model,
     service=service,
     system_prompt="...",
-    middleware=[log_usage],
+    middleware=[log_steps],
 ) as agent: ...
 ```

-The hooks can stop the Agentic Loop under custom conditions by raising exceptions.
-The logic of the hook can be more advanced and include multiple conditions, for example, based on both token usage and execution time:
+The hooks can stop the Agentic Loop under custom conditions by raising exceptions, for example:

 ```py
 from splunklib.ai.hooks import before_model
 from splunklib.ai.middleware import AgentMiddleware, ModelRequest

-def token_and_step_limit(token_limit: float, step_limit: int) -> AgentMiddleware:
+def token_and_message_limit(message_limit: int) -> AgentMiddleware:
     @before_model
     def _hook(req: ModelRequest) -> None:
-        if req.state.token_count > token_limit or req.state.total_steps >= step_limit:
+        if len(req.state.messages) >= message_limit:
             raise Exception("Stopping Agentic Loop")

     return _hook


 async with Agent(
     ...,
-    middleware=[token_and_step_limit(token_limit=10_000, step_limit=5)],
+    middleware=[token_and_message_limit(message_limit=5)],
 ) as agent: ...
 ```

-## Default limit middlewares
+## Default limits

 Every `Agent` automatically applies sane default limits to prevent runaway execution
-or excessive token usage. Default limit middlewares are appended after any user-supplied
-middleware, so they always act on the final state of the request. If you override one of
-the defaults by passing your own instance, you are responsible for its position in the
-chain - place it last if you want the same behavior.
+or excessive token usage.

-| Middleware | Default | Measured |
+| Limit | Default | Measured |
 |---|---|---|
-| `TokenLimitMiddleware` | 200 000 tokens | token count of messages passed to the model |
-| `StepLimitMiddleware` | 100 steps | steps taken |
-| `TimeoutLimitMiddleware` | 600 seconds (10 minutes) | per `invoke` call |
-| `StructuredOutputRetryLimitMiddleware` | 3 retries | per `invoke` call |
+| `max_tokens` | 200 000 tokens | token count of messages passed to the model |
+| `max_steps` | 100 steps | number of messages in the conversation |
+| `timeout` | 600 seconds (10 minutes) | per `invoke` call |
+| `max_structured_output_retires` | 3 retries | per `invoke` call |

-`TokenLimitMiddleware` and `StepLimitMiddleware` check the values from the messages passed to the
-model on each call. `TimeoutLimitMiddleware` and `StructuredOutputRetryLimitMiddlewa` resets its
-deadline/limit on each `invoke`, so effectively these limit only the agent loop.
+`max_tokens` and `max_steps` are checked against the messages passed to the model on each call.
+`timeout` and `max_structured_output_retires` reset on each `invoke`, so they limit only the
+current agent loop invocation.

 When a limit is exceeded, the agent raises the corresponding exception:
-`TokenLimitExceededException`, `StepsLimitExceededException`, or `TimeoutExceededException`,
+`TokenLimitExceededException`, `StepsLimitExceededException`, `TimeoutExceededException`, or
 `StructuredOutputRetryLimitExceededException`.

 ### Overriding defaults

-To override a specific limit, pass your own instance of the corresponding middleware
-class. The default for that limit is suppressed automatically - the other defaults
-remain active:
+Limits are configured via the `AgentLimits` dataclass passed to the `Agent` constructor.
+Only the fields you specify are overridden; the rest keep their defaults:

 ```py
-from splunklib.ai.limits import (
-    TokenLimitMiddleware,
-    StepLimitMiddleware,
-    TimeoutLimitMiddleware,
-    StructuredOutputRetryLimitMiddleware,
-)
+from splunklib.ai.limits import AgentLimits

 async with Agent(
     ...,
-    middleware=[
-        TokenLimitMiddleware(50_000),  # overrides default 200 000; other defaults still apply
-    ],
+    limits=AgentLimits(max_tokens=50_000),  # overrides default 200 000; other defaults still apply
 ) as agent: ...
 ```

-To override all defaults, pass all of these to Agent's middleware list:
+To override all defaults:

 ```py
 async with Agent(
     ...,
-    middleware=[
-        StructuredOutputRetryLimitMiddleware(0),  # no-retries.
-        TokenLimitMiddleware(50_000),
-        StepLimitMiddleware(10),
-        TimeoutLimitMiddleware(30.0),
-    ],
+    limits=AgentLimits(
+        max_tokens=50_000,
+        max_steps=10,
+        timeout=30.0,
+        max_structured_output_retires=0,  # no retries
+    ),
 ) as agent: ...
 ```

-**Note**: When overriding limit middlewares, order matters. Place `StructuredOutputRetryLimitMiddleware`
-first and `TokenLimitMiddleware`, `StepLimitMiddleware`, and `TimeoutLimitMiddleware` last,
-otherwise the limits may not behave as expected.
-
 There is no explicit opt-out - the intent is that agents should always have some guardrails.

 ## Logger