Summary
The IBM watsonx.ai Python SDK (ibm-watsonx-ai) is IBM's official client for the watsonx.ai platform, which hosts foundation models (Granite, Llama, Mistral, and others) on IBM Cloud and on-premises. The SDK provides a unique, non-OpenAI-compatible execution surface through ModelInference.generate(), ModelInference.chat(), streaming variants, and TextEmbeddings.embed(). This repository has zero instrumentation for any watsonx.ai SDK surface — no integration directory, no wrapper, no patcher, no auto_instrument() support.
Users who call ibm-watsonx-ai directly cannot use wrap_openai() or any other existing wrapper because ModelInference is a distinct client class with its own request/response schema. The IBM watsonx.ai API is not accessible through the Braintrust AI Proxy (which covers OpenAI-compatible endpoints). Enterprise users running watsonx.ai workloads get zero Braintrust spans today.
The SDK is actively maintained with frequent weekly releases (v1.5.12, December 2024). Comparable provider SDKs with dedicated native integrations in this repo: anthropic, cohere, mistralai, google-genai, huggingface-hub.
What needs to be instrumented
The ibm-watsonx-ai package exposes these execution surfaces via ModelInference, none of which are instrumented:
Text generation (highest priority)
| SDK Method |
Description |
Streaming |
ModelInference.generate(prompt, ...) |
Single-prompt text generation |
No |
ModelInference.generate_stream(prompt, ...) |
Streaming text generation |
Generator of dicts |
Response shape: generate() returns a dict with results[0].generated_text, results[0].generated_token_count, results[0].input_token_count, results[0].stop_reason, and results[0].seed. Token counts are directly available for span metrics.
Chat completions
| SDK Method |
Description |
Streaming |
ModelInference.chat(messages, ...) |
Chat completions (OpenAI-message-format input) |
No |
ModelInference.chat_stream(messages, ...) |
Streaming chat completions |
Generator of dicts |
Response shape: chat() returns a dict with choices[0].message.content, choices[0].finish_reason, usage.prompt_tokens, usage.completion_tokens, usage.total_tokens. This mirrors an OpenAI-like response but comes from a ModelInference instance, not an OpenAI client.
Embeddings
| SDK Method |
Description |
TextEmbeddings.embed(inputs, ...) |
Generate embeddings for a list of texts |
Return type: dict with results[0].embedding (list of floats) and results[0].input_token_count.
Implementation notes
Client instantiation: ModelInference takes a model_id string, credentials (API key + URL), and project_id or space_id. The model_id captures the foundation model used (e.g. "ibm/granite-13b-chat-v2", "meta-llama/llama-3-70b-instruct").
Auth: Uses IBM Cloud IAM tokens or API keys (not SigV4). VCR cassettes will need IBM IAM auth header sanitization.
No async client: The standard ibm-watsonx-ai library is synchronous. Async support may be added in a follow-up.
Parameters relevant for span metadata: model_id, params (contains max_new_tokens, temperature, top_p, top_k, repetition_penalty, stop_sequences, decoding_method).
Proposed span shape
generate() / generate_stream()
| Span field |
Content |
| input |
prompt |
| output |
generated_text from first result |
| metadata |
provider: "ibm_watsonx", model (from model_id), generation params |
| metrics |
tokens, prompt_tokens, completion_tokens |
chat() / chat_stream()
| Span field |
Content |
| input |
messages |
| output |
choices[0].message.content |
| metadata |
provider: "ibm_watsonx", model (from model_id), generation params |
| metrics |
tokens, prompt_tokens, completion_tokens |
No coverage in any instrumentation layer
- No integration directory (
py/src/braintrust/integrations/watsonx/)
- No wrapper function (e.g.
wrap_watsonx())
- No patcher in any existing integration
- No nox test session (
test_watsonx)
- No version entry in
py/src/braintrust/integrations/versioning.py
- No mention in
py/src/braintrust/integrations/__init__.py
A grep for watsonx, ibm_watsonx, or ibm-watsonx across py/src/braintrust/ returns zero matches.
Braintrust docs status
not_found — IBM watsonx.ai is not listed on the Braintrust AI providers page or the tracing guide. A direct docs page (/docs/integrations/ai-providers/watsonx) returns 404. There is no proxy path documented for watsonx.ai (which requires IBM Cloud IAM auth, not an OpenAI-compatible endpoint).
Upstream references
Local repo files inspected
py/src/braintrust/integrations/ — no watsonx/ directory on main
py/src/braintrust/wrappers/ — no watsonx wrapper
py/noxfile.py — no test_watsonx session
py/pyproject.toml [tool.braintrust.matrix] — no watsonx entry
py/src/braintrust/integrations/__init__.py — watsonx not listed
py/src/braintrust/integrations/versioning.py — no watsonx version matrix
- Full repo grep for
watsonx, ibm_watsonx, ibm-watsonx — zero matches in SDK source
Summary
The IBM watsonx.ai Python SDK (
ibm-watsonx-ai) is IBM's official client for the watsonx.ai platform, which hosts foundation models (Granite, Llama, Mistral, and others) on IBM Cloud and on-premises. The SDK provides a unique, non-OpenAI-compatible execution surface throughModelInference.generate(),ModelInference.chat(), streaming variants, andTextEmbeddings.embed(). This repository has zero instrumentation for any watsonx.ai SDK surface — no integration directory, no wrapper, no patcher, noauto_instrument()support.Users who call
ibm-watsonx-aidirectly cannot usewrap_openai()or any other existing wrapper becauseModelInferenceis a distinct client class with its own request/response schema. The IBM watsonx.ai API is not accessible through the Braintrust AI Proxy (which covers OpenAI-compatible endpoints). Enterprise users running watsonx.ai workloads get zero Braintrust spans today.The SDK is actively maintained with frequent weekly releases (v1.5.12, December 2024). Comparable provider SDKs with dedicated native integrations in this repo:
anthropic,cohere,mistralai,google-genai,huggingface-hub.What needs to be instrumented
The
ibm-watsonx-aipackage exposes these execution surfaces viaModelInference, none of which are instrumented:Text generation (highest priority)
ModelInference.generate(prompt, ...)ModelInference.generate_stream(prompt, ...)Generatorof dictsResponse shape:
generate()returns a dict withresults[0].generated_text,results[0].generated_token_count,results[0].input_token_count,results[0].stop_reason, andresults[0].seed. Token counts are directly available for span metrics.Chat completions
ModelInference.chat(messages, ...)ModelInference.chat_stream(messages, ...)Generatorof dictsResponse shape:
chat()returns a dict withchoices[0].message.content,choices[0].finish_reason,usage.prompt_tokens,usage.completion_tokens,usage.total_tokens. This mirrors an OpenAI-like response but comes from aModelInferenceinstance, not an OpenAI client.Embeddings
TextEmbeddings.embed(inputs, ...)Return type: dict with
results[0].embedding(list of floats) andresults[0].input_token_count.Implementation notes
Client instantiation:
ModelInferencetakes amodel_idstring,credentials(API key + URL), andproject_idorspace_id. Themodel_idcaptures the foundation model used (e.g."ibm/granite-13b-chat-v2","meta-llama/llama-3-70b-instruct").Auth: Uses IBM Cloud IAM tokens or API keys (not SigV4). VCR cassettes will need IBM IAM auth header sanitization.
No async client: The standard
ibm-watsonx-ailibrary is synchronous. Async support may be added in a follow-up.Parameters relevant for span metadata:
model_id,params(containsmax_new_tokens,temperature,top_p,top_k,repetition_penalty,stop_sequences,decoding_method).Proposed span shape
generate()/generate_stream()promptgenerated_textfrom first resultprovider: "ibm_watsonx",model(frommodel_id), generation paramstokens,prompt_tokens,completion_tokenschat()/chat_stream()messageschoices[0].message.contentprovider: "ibm_watsonx",model(frommodel_id), generation paramstokens,prompt_tokens,completion_tokensNo coverage in any instrumentation layer
py/src/braintrust/integrations/watsonx/)wrap_watsonx())test_watsonx)py/src/braintrust/integrations/versioning.pypy/src/braintrust/integrations/__init__.pyA grep for
watsonx,ibm_watsonx, oribm-watsonxacrosspy/src/braintrust/returns zero matches.Braintrust docs status
not_found— IBM watsonx.ai is not listed on the Braintrust AI providers page or the tracing guide. A direct docs page (/docs/integrations/ai-providers/watsonx) returns 404. There is no proxy path documented for watsonx.ai (which requires IBM Cloud IAM auth, not an OpenAI-compatible endpoint).Upstream references
Local repo files inspected
py/src/braintrust/integrations/— nowatsonx/directory onmainpy/src/braintrust/wrappers/— no watsonx wrapperpy/noxfile.py— notest_watsonxsessionpy/pyproject.toml[tool.braintrust.matrix]— no watsonx entrypy/src/braintrust/integrations/__init__.py— watsonx not listedpy/src/braintrust/integrations/versioning.py— no watsonx version matrixwatsonx,ibm_watsonx,ibm-watsonx— zero matches in SDK source