
[bot] Add IBM watsonx.ai Python SDK integration for ModelInference generate, chat, and embedding instrumentation #480

@braintrust-bot

Summary

The IBM watsonx.ai Python SDK (ibm-watsonx-ai) is IBM's official client for the watsonx.ai platform, which hosts foundation models (Granite, Llama, Mistral, and others) on IBM Cloud and on-premises. The SDK provides a unique, non-OpenAI-compatible execution surface through ModelInference.generate(), ModelInference.chat(), streaming variants, and TextEmbeddings.embed(). This repository has zero instrumentation for any watsonx.ai SDK surface — no integration directory, no wrapper, no patcher, no auto_instrument() support.

Users who call ibm-watsonx-ai directly cannot use wrap_openai() or any other existing wrapper because ModelInference is a distinct client class with its own request/response schema. The IBM watsonx.ai API is not accessible through the Braintrust AI Proxy (which covers OpenAI-compatible endpoints). Enterprise users running watsonx.ai workloads get zero Braintrust spans today.

The SDK is actively maintained, with roughly weekly releases (v1.5.12, December 2024). Comparable provider SDKs that already have dedicated native integrations in this repo include anthropic, cohere, mistralai, google-genai, and huggingface-hub.

What needs to be instrumented

The ibm-watsonx-ai package exposes these execution surfaces via ModelInference and TextEmbeddings, none of which are instrumented:

Text generation (highest priority)

| SDK Method | Description | Streaming |
| --- | --- | --- |
| ModelInference.generate(prompt, ...) | Single-prompt text generation | No |
| ModelInference.generate_stream(prompt, ...) | Streaming text generation | Generator of dicts |

Response shape: generate() returns a dict with results[0].generated_text, results[0].generated_token_count, results[0].input_token_count, results[0].stop_reason, and results[0].seed. Token counts are directly available for span metrics.
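A minimal sketch of how the generate() response shape described above could be mapped onto span fields. The helper name `generate_span_fields` and the returned dict layout are hypothetical; only the response keys come from the description above, and the sketch handles a single-prompt response only.

```python
from typing import Any


def generate_span_fields(response: dict[str, Any]) -> dict[str, Any]:
    """Map a generate() response dict onto span output, metrics, and metadata.

    Hypothetical helper: assumes the response carries the keys listed above
    under results[0]. Missing keys are skipped rather than raising.
    """
    results = response.get("results") or [{}]
    first = results[0]

    prompt_tokens = first.get("input_token_count")
    completion_tokens = first.get("generated_token_count")

    metrics: dict[str, int] = {}
    if prompt_tokens is not None:
        metrics["prompt_tokens"] = prompt_tokens
    if completion_tokens is not None:
        metrics["completion_tokens"] = completion_tokens
    # Only report a total when both parts are known.
    if prompt_tokens is not None and completion_tokens is not None:
        metrics["tokens"] = prompt_tokens + completion_tokens

    return {
        "output": first.get("generated_text"),
        "metrics": metrics,
        "metadata": {"stop_reason": first.get("stop_reason")},
    }
```

Summing the two counts to get `tokens` is a design choice here, since the generate() response as described does not carry a total.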

Chat completions

| SDK Method | Description | Streaming |
| --- | --- | --- |
| ModelInference.chat(messages, ...) | Chat completions (OpenAI-message-format input) | No |
| ModelInference.chat_stream(messages, ...) | Streaming chat completions | Generator of dicts |

Response shape: chat() returns a dict with choices[0].message.content, choices[0].finish_reason, usage.prompt_tokens, usage.completion_tokens, usage.total_tokens. This mirrors an OpenAI-like response but comes from a ModelInference instance, not an OpenAI client.
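The chat() shape differs from generate() mainly in where the token counts live. A sketch of the corresponding mapping, assuming the keys listed above; the helper name `chat_span_fields` and the output layout are hypothetical.

```python
from typing import Any

# Maps keys of the response's "usage" block to the proposed span metric names.
USAGE_TO_METRIC = {
    "prompt_tokens": "prompt_tokens",
    "completion_tokens": "completion_tokens",
    "total_tokens": "tokens",
}


def chat_span_fields(response: dict[str, Any]) -> dict[str, Any]:
    """Map a chat() response dict onto span output, metrics, and metadata.

    Hypothetical helper: missing keys are tolerated and simply omitted.
    """
    choices = response.get("choices") or [{}]
    first = choices[0]
    usage = response.get("usage") or {}

    metrics = {
        metric: usage[source]
        for source, metric in USAGE_TO_METRIC.items()
        if usage.get(source) is not None
    }

    return {
        "output": (first.get("message") or {}).get("content"),
        "metrics": metrics,
        "metadata": {"finish_reason": first.get("finish_reason")},
    }
```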

Embeddings

| SDK Method | Description |
| --- | --- |
| TextEmbeddings.embed(inputs, ...) | Generate embeddings for a list of texts |

Response shape: embed() returns a dict with results[0].embedding (list of floats) and results[0].input_token_count.
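This issue does not propose a span shape for embeddings, so the following is only one possible sketch. It assumes each entry in `results` has the keys described above, and it logs a summary (count and vector length) instead of the raw vectors to keep spans small. The helper name and the summary fields are hypothetical.

```python
from typing import Any


def embedding_span_fields(response: dict[str, Any]) -> dict[str, Any]:
    """Summarize an embeddings response for a span (hypothetical helper)."""
    results = response.get("results") or []

    # Sum per-result token counts where they are present.
    token_counts = [
        r["input_token_count"]
        for r in results
        if r.get("input_token_count") is not None
    ]
    first_embedding = results[0].get("embedding") if results else None

    metrics: dict[str, int] = {}
    if token_counts:
        total = sum(token_counts)
        # Embedding calls only consume input tokens, so both metrics match.
        metrics = {"prompt_tokens": total, "tokens": total}

    return {
        "output": {
            "embedding_count": len(results),
            "embedding_length": len(first_embedding) if first_embedding else None,
        },
        "metrics": metrics,
    }
```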

Implementation notes

Client instantiation: ModelInference takes a model_id string, credentials (API key + URL), and project_id or space_id. The model_id captures the foundation model used (e.g. "ibm/granite-13b-chat-v2", "meta-llama/llama-3-70b-instruct").

Auth: Uses IBM Cloud IAM tokens or API keys (not SigV4). VCR cassettes will need IBM IAM auth header sanitization.
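A sketch of the kind of sanitization the cassettes would need, written as plain functions so it does not depend on a particular recording library. It assumes the bearer token travels in the `Authorization` header and that the API-key exchange request is a form-encoded body with an `apikey` field; both the helper names and that field name are assumptions to verify against real recorded traffic.

```python
from urllib.parse import parse_qsl, urlencode

REDACTED = "REDACTED"
# Header names are compared case-insensitively.
SENSITIVE_HEADERS = {"authorization"}


def scrub_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers with credential-bearing values replaced."""
    return {
        name: (REDACTED if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in headers.items()
    }


def scrub_form_body(body: str, fields: tuple[str, ...] = ("apikey",)) -> str:
    """Replace sensitive fields in a form-encoded request body.

    The default field name "apikey" is an assumption about the token
    exchange request, not something confirmed by this issue.
    """
    pairs = parse_qsl(body, keep_blank_values=True)
    return urlencode([(k, REDACTED if k in fields else v) for k, v in pairs])
```

If the test suite records with vcrpy, its `filter_headers` and `before_record_request` options are the usual places to hook logic like this in.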

No async client: The standard ibm-watsonx-ai library is synchronous. Async support may be added in a follow-up.

Parameters relevant for span metadata: model_id, params (contains max_new_tokens, temperature, top_p, top_k, repetition_penalty, stop_sequences, decoding_method).
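One way to turn these into span metadata is an allowlist over `params`, sketched below. The helper name `build_metadata` and the choice to flatten the parameters into the top level of the metadata dict are assumptions; the key names and the `provider` value come from this issue.

```python
from typing import Any, Optional

# Generation parameters named in this issue as relevant for span metadata.
SPAN_PARAM_KEYS = (
    "max_new_tokens",
    "temperature",
    "top_p",
    "top_k",
    "repetition_penalty",
    "stop_sequences",
    "decoding_method",
)


def build_metadata(model_id: str, params: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Build span metadata from a model_id and an optional params dict."""
    metadata: dict[str, Any] = {"provider": "ibm_watsonx", "model": model_id}
    for key in SPAN_PARAM_KEYS:
        # Copy only allowlisted keys that the caller actually set.
        if params and key in params:
            metadata[key] = params[key]
    return metadata
```

An allowlist keeps unrelated or sensitive entries in `params` out of the span; a denylist would be the alternative if new parameters should flow through automatically.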

Proposed span shape

generate() / generate_stream()

| Span field | Content |
| --- | --- |
| input | prompt |
| output | generated_text from first result |
| metadata | provider: "ibm_watsonx", model (from model_id), generation params |
| metrics | tokens, prompt_tokens, completion_tokens |

chat() / chat_stream()

| Span field | Content |
| --- | --- |
| input | messages |
| output | choices[0].message.content |
| metadata | provider: "ibm_watsonx", model (from model_id), generation params |
| metrics | tokens, prompt_tokens, completion_tokens |
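For the streaming variants, the span output is only known once the generator is exhausted, so the wrapper has to pass chunks through while accumulating them. The sketch below assumes chat_stream() chunks follow an OpenAI-like shape with `choices[].delta.content` and an optional `usage` block; that chunk shape is an assumption not stated in this issue, and `traced_chat_stream` and `on_complete` are hypothetical names.

```python
from typing import Any, Callable, Iterable, Iterator


def traced_chat_stream(
    stream: Iterable[dict[str, Any]],
    on_complete: Callable[[dict[str, Any]], None],
) -> Iterator[dict[str, Any]]:
    """Yield chunks unchanged and report the aggregated result at the end."""
    parts: list[str] = []
    usage: dict[str, Any] = {}
    try:
        for chunk in stream:
            # Assumed chunk shape: choices[].delta.content, as in OpenAI streams.
            for choice in chunk.get("choices") or []:
                delta = choice.get("delta") or {}
                if delta.get("content"):
                    parts.append(delta["content"])
            if chunk.get("usage"):
                usage = chunk["usage"]
            yield chunk
    finally:
        # Runs on normal exhaustion, on error, and when the consumer
        # closes the generator early, so the span is always finalized.
        on_complete({"output": "".join(parts), "usage": usage})
```

Reporting from a `finally` block means a partially consumed stream still produces a span with whatever text was received.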

No coverage in any instrumentation layer

  • No integration directory (py/src/braintrust/integrations/watsonx/)
  • No wrapper function (e.g. wrap_watsonx())
  • No patcher in any existing integration
  • No nox test session (test_watsonx)
  • No version entry in py/src/braintrust/integrations/versioning.py
  • No mention in py/src/braintrust/integrations/__init__.py

A grep for watsonx, ibm_watsonx, or ibm-watsonx across py/src/braintrust/ returns zero matches.
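To make the missing wrapper concrete, here is a rough sketch of what a `wrap_watsonx()` could look like for generate() alone. It is not the repo's integration pattern: `log_span` is a hypothetical callback standing in for real span logging, the proxy works with any object that has a `generate(prompt, params=...)` method and a `model_id` attribute, and it only handles a single-prompt dict response.

```python
import time
from typing import Any, Callable, Optional


class TracedModel:
    """Proxy that forwards to a ModelInference-like object and records generate()."""

    def __init__(self, model: Any, log_span: Callable[[dict[str, Any]], None]):
        self._model = model
        self._log_span = log_span

    def __getattr__(self, name: str) -> Any:
        # Only reached when normal lookup fails, so every method that is
        # not instrumented here passes straight through to the wrapped model.
        return getattr(self._model, name)

    def generate(self, prompt: str, params: Optional[dict[str, Any]] = None, **kwargs: Any) -> Any:
        start = time.time()
        response = self._model.generate(prompt, params=params, **kwargs)
        end = time.time()

        first = (response.get("results") or [{}])[0]
        prompt_tokens = first.get("input_token_count") or 0
        completion_tokens = first.get("generated_token_count") or 0
        self._log_span({
            "input": prompt,
            "output": first.get("generated_text"),
            "metadata": {
                "provider": "ibm_watsonx",
                "model": getattr(self._model, "model_id", None),
                **(params or {}),
            },
            "metrics": {
                "start": start,
                "end": end,
                "prompt_tokens": prompt_tokens,
                "completion_tokens": completion_tokens,
                "tokens": prompt_tokens + completion_tokens,
            },
        })
        return response


def wrap_watsonx(model: Any, log_span: Callable[[dict[str, Any]], None]) -> TracedModel:
    """Hypothetical entry point mirroring the wrap_*() naming used in this issue."""
    return TracedModel(model, log_span)
```

A proxy object leaves the SDK class untouched; patching the class methods in place would be the alternative if auto_instrument() support is the goal.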

Braintrust docs status

not_found — IBM watsonx.ai is not listed on the Braintrust AI providers page or the tracing guide. A direct docs page (/docs/integrations/ai-providers/watsonx) returns 404. There is no proxy path documented for watsonx.ai (which requires IBM Cloud IAM auth, not an OpenAI-compatible endpoint).

Upstream references

Local repo files inspected

  • py/src/braintrust/integrations/ — no watsonx/ directory on main
  • py/src/braintrust/wrappers/ — no watsonx wrapper
  • py/noxfile.py — no test_watsonx session
  • py/pyproject.toml [tool.braintrust.matrix] — no watsonx entry
  • py/src/braintrust/integrations/__init__.py — watsonx not listed
  • py/src/braintrust/integrations/versioning.py — no watsonx version matrix
  • Full repo grep for watsonx, ibm_watsonx, ibm-watsonx — zero matches in SDK source
