Open
Conversation
Adds a BRAINTRUST_ENABLED environment variable (default: true) that gates all Braintrust tracing, logging, and prompt fetching. When set to false, the system skips SDK setup, span logging, TracedThreadPool, flush calls, and remote prompt fetching — reducing noise and overhead during load tests. Fixes SC-41751 Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Rename 'inner' to 'message_task' for clarity in send_message() - Move braintrust import to top of prompt_service.py Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Introduced a new mock Anthropic API server for load testing, allowing simulation of API calls without incurring costs. - Updated `docker-compose.yml` to include the mock server and its health checks. - Added load testing scripts and configuration files to facilitate performance testing of the chatbot server. - Created documentation for usage and configuration of the load testing setup. - Implemented tests for the mock server to ensure correct behavior and response formats. This enhances the testing framework and allows for more efficient performance evaluations.
…re/sc-41751/implement-environment-variable-to-skip
…iles - Renamed environment variable from `BRAINTRUST_ENABLED` to `BRAINTRUST_LOGGING_ENABLED` for clarity in tracing control. - Updated related documentation and code references to reflect the new variable name. - Added Kubernetes deployment files for the application, mock Anthropic API, and PostgreSQL, including necessary configurations and health checks. - Introduced a secrets example file for managing sensitive information. This enhances the deployment process and improves the clarity of Braintrust logging settings.
- Removed obsolete files from the .claude directory, including configuration and context files. - Updated .gitignore to exclude new directories and files related to Claude's configuration and secrets management. This cleanup improves project organization and ensures sensitive files are not tracked.
- Introduced a new settings.json file in the .claude directory to manage plugin permissions and configurations. - Updated .gitignore to ensure settings.json is not ignored, allowing it to be tracked. This addition enhances the configuration management for Claude plugins.
akiva10b
reviewed
Feb 17, 2026
akiva10b
reviewed
Feb 17, 2026
Contributor
akiva10b
left a comment
There was a problem hiding this comment.
Arc change: you can use the function get_agent_service which does a new init on every claud agent sdk to determine which api endpoint we want to use. In this way, you dont need to run another server but rather can just stipulate in the request how you want the agent to act
…tern - Remove scattered Braintrust if-else guards; rely on SDK no-op semantics when BRAINTRUST_LOGGING_ENABLED=false (current_span() returns noop span) - Add get_agent_service(is_load_testing) factory: routes to mock Anthropic server when true, real API when false - Add _IS_LOAD_TESTING module-level constant in views.py and anthropic_views.py - Replace ANTHROPIC_BASE_URL env var with IS_LOAD_TESTING + MOCK_ANTHROPIC_URL in docker-compose.yml - Delete k8s/ directory; update README and docs to remove k8s references - Update BRAINTRUST_TRACING.md to reflect actual no-op behavior Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Renamed `braintrust_enabled` to `braintrust_logging_enabled` across multiple files to improve clarity in tracing control. - Removed conditional guards related to `braintrust_logging_enabled` in the `send_message` and `_send_message_inner` methods, simplifying the logic. - Updated documentation to reflect the changes in variable naming and behavior. This refactor enhances the maintainability of the code and aligns with the updated logging configuration.
…actory pattern - Removed scattered if-else guards related to Braintrust logging, simplifying the logic by relying on SDK no-op semantics. - Introduced a factory pattern for `get_agent_service(is_load_testing)` to switch between mock and real Anthropic endpoints without environment variable changes. - Updated `ClaudeAgentService` to accept a `base_url` parameter, allowing for flexible endpoint configuration. - Simplified logging and tracing logic across multiple files, enhancing maintainability and clarity. This refactor improves the overall structure and usability of the logging and agent service components.
Replace IS_LOAD_TESTING process-level env var with a per-request boolean. Load test script passes isLoadTest:true in the request body; views.py uses that flag to call get_agent_service(is_load_testing=...). Removes IS_LOAD_TESTING from docker-compose.yml and all docs. Also: - Fix AES-GCM nonce to use os.urandom(12) instead of fixed all-zero bytes - Add tests for isLoadTest serializer field, get_agent_service factory, BraintrustConfig.enabled, and _setup_braintrust_tracing early-return Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
akiva10b
approved these changes
Feb 19, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds load testing for the chatbot server without using the real Anthropic API. A mock Anthropic server replaces the real API so you can measure capacity (threads, memory, DB, streaming) at no cost. Includes Docker Compose setup and Braintrust tracing changes for load-test runs.
What’s new
Load testing stack
server/loadtest/mock_anthropic.py) — FastAPI server that mimics the Anthropic Messages API (SSE streaming, tool-calling). Uses the same event types and payloads as the real API so the Claude Agent SDK runs end-to-end.server/loadtest/load_test.py) — Async httpx script that sends concurrent requests to/api/v2/chat/streamand reports TTFB, total response time, error rate, and throughput.server/loadtest/test_mock_anthropic.py) — Pytest tests for the mock’s SSE format and tool-calling behavior.Docker Compose
mock-anthropic— Runs the mock Anthropic API on port 8002.loadtest— Runs the load test against the app (default: 50 requests, 10 concurrent).app— UsesIS_LOAD_TESTING=trueandMOCK_ANTHROPIC_URLto route to the mock instead of the real API.Agent service factory
get_agent_service(is_load_testing)— Chooses mock vs real Anthropic based onIS_LOAD_TESTING.ClaudeAgentService— Accepts optionalbase_urlfor mock routing.IS_LOAD_TESTING+MOCK_ANTHROPIC_URL— ReplacesANTHROPIC_BASE_URLfor load-test configuration.Braintrust tracing
BRAINTRUST_LOGGING_ENABLED— Can be set tofalseto disable tracing during load tests.if bt_span:.flush_braintrust()— Always callsbraintrust.flush()(no-op when logging is disabled).How to run
docker compose up -d --build docker compose exec app python manage.py migrate docker compose run --rm loadtestFiles changed
server/loadtest/— mock, load script, tests, Dockerfiles, READMEdocker-compose.yml— mock-anthropic, loadtest, app env varsclaude_service.py,views.py,anthropic_views.py,utils.pyclaude_service.py,utils.py,BRAINTRUST_TRACING.mddocs/ARCHITECTURE.md,docs/plans/braintrust-factory-refactor.md