Live provider integration tests and real HTTP client #3

Open
tomjnsn wants to merge 92 commits into evmts:main from tomjnsn:feature/41-live-provider-integration-tests
tomjnsn commented Feb 11, 2026

Summary

  • Implement real StdHttpClient using Zig 0.15's std.http.Client with full TLS/HTTPS support, replacing the previous stub
  • Add zig build test-live step with 9 integration tests (OpenAI, Azure, xAI × generateText/streamText/error-diagnostic) that skip gracefully when API keys are absent
  • Fix google-vertex relative path imports, anthropic vtable type mismatch, and google response_format handling

Test plan

  • zig build test passes (all existing unit tests, no regression)
  • zig build test-live compiles and all 9 tests skip gracefully without env vars
  • zig build test-live with OPENAI_API_KEY set: 3 OpenAI tests pass
  • zig build test-live with AZURE_API_KEY/AZURE_RESOURCE_NAME/AZURE_DEPLOYMENT_NAME set: 3 Azure tests pass
  • zig build test-live with XAI_API_KEY set: 3 xAI tests pass
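
For reviewers unfamiliar with how a Zig test can skip instead of fail, here is a minimal sketch of the pattern the test plan relies on. The helper name `requireEnv` is hypothetical and not taken from this PR. It assumes the live tests read keys with `std.process.getEnvVarOwned` and return `error.SkipZigTest` when a key is missing or empty.

```zig
const std = @import("std");

/// Hypothetical helper: returns the value of `name`, or error.SkipZigTest when
/// the variable is unset or empty, so the runner reports a skip, not a failure.
fn requireEnv(allocator: std.mem.Allocator, name: []const u8) ![]u8 {
    const value = std.process.getEnvVarOwned(allocator, name) catch |err| switch (err) {
        error.EnvironmentVariableNotFound => return error.SkipZigTest,
        else => return err,
    };
    if (value.len == 0) {
        allocator.free(value);
        return error.SkipZigTest;
    }
    return value;
}

test "live: skipped unless OPENAI_API_KEY is set" {
    const key = try requireEnv(std.testing.allocator, "OPENAI_API_KEY");
    defer std.testing.allocator.free(key);
    // A real live test would call the provider here; this sketch only
    // confirms that a usable key was found.
    try std.testing.expect(key.len > 0);
}
```

With no key in the environment, the test runner counts this test as skipped and the step still succeeds.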

Closes #41

🤖 Generated with Claude Code

tomjnsn and others added 30 commits February 3, 2026 16:12
… improve memory safety

- Standardize http_client field to typed ?provider_utils.HttpClient across 34 files
- Add MockHttpClient for testing providers without network requests
- Remove singleton pattern from 26 providers (fixes memory leaks)
- Add getHeaders() functions to 11 providers missing them
- Fix silent error suppression (catch continue) in 5 locations
- Document vtable pattern, pointer casting, and memory ownership in CLAUDE.md
- Update HttpClient interface documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ance tests

- Update Anthropic API version to 2024-06-01
- Implement full HTTP layer for Google language/embedding/image models
- Implement full HTTP layer for Vertex embedding/image models
- Create response types for Google (GoogleGenerateContentResponse, etc.)
- Create response types for Vertex (VertexPredictEmbeddingResponse, etc.)
- Add comprehensive compliance tests for OpenAI/Anthropic/Azure
- Add HTTP integration tests for Google and Vertex providers
- Fix Vertex embedding callback bug (incorrect parameter order)
- Update README with HTTP client docs and recent changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements redactApiKey() and containsApiKey() functions that detect
and redact sensitive API key patterns (sk-, sk-proj-, anthropic-sk-ant-)
from text to prevent credential leakage in error messages and logs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move security.zig to provider package (fixes circular dependency since
api-call-error.zig is in provider). Re-export from provider-utils.
Add test verifying format() redacts API keys in response body.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ApiCallError.format() now redacts sensitive API key patterns from
response body before including it in the formatted error output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add max_response_size field to Request, PostJsonToApiOptions, and
PostToApiOptions. Add response_too_large error kind. Test verifies
responses exceeding the limit are rejected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
postJsonToApi and postToApi now check response body size against
max_response_size and return an error if exceeded, preventing DoS
via memory exhaustion from oversized responses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test verifies EventSourceParser rejects data that exceeds the
configured max_buffer_size, returning BufferLimitExceeded error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EventSourceParser now supports max_buffer_size via initWithMaxBuffer().
Returns error.BufferLimitExceeded when incoming data would exceed the
configured limit, preventing memory exhaustion from malicious streams.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
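
The limit check described above can be sketched roughly as follows. This is an illustration under assumptions, not the parser's actual code: `BoundedLength` is a hypothetical stand-in that tracks only lengths, and the projected size is computed before anything is appended.

```zig
const std = @import("std");

/// Hypothetical stand-in for the parser's size accounting (lengths only).
const BoundedLength = struct {
    len: usize = 0,
    max: usize,

    /// Rejects the incoming chunk if current + incoming would exceed `max`.
    /// Nothing is recorded on failure, so the state stays consistent.
    fn reserve(self: *BoundedLength, incoming: usize) error{BufferLimitExceeded}!void {
        // std.math.add guards against usize overflow in the projection itself.
        const projected = std.math.add(usize, self.len, incoming) catch
            return error.BufferLimitExceeded;
        if (projected > self.max) return error.BufferLimitExceeded;
        self.len = projected;
    }
};

test "rejects data that would exceed the limit" {
    var b = BoundedLength{ .max = 8 };
    try b.reserve(5);
    try std.testing.expectError(error.BufferLimitExceeded, b.reserve(4));
    try std.testing.expectEqual(@as(usize, 5), b.len); // unchanged after rejection
    try b.reserve(3); // exactly at the limit is still accepted
}
```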
Change HttpClient.post() from silently truncating headers at 64 to
returning error.TooManyHeaders, preventing silent data loss.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
loadApiKey/loadOptionalSetting use std.heap.page_allocator for
getEnvVarOwned but return the result to callers who don't know
which allocator to use for freeing. This causes a mismatch between
allocation and deallocation allocators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add allocator field to LoadApiKeyOptions and LoadSettingOptions
(defaults to page_allocator for backward compat). Fix all env var
lookups to use the passed allocator. Fix memory leak where empty
env values weren't freed before returning error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests cover HTTPS enforcement, malformed URL rejection, HTTP override,
and URL normalization (duplicate slash removal).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
validateUrl() enforces HTTPS-only (with allow_http override) and
rejects empty, schemeless, and non-HTTP(S) URLs. normalizeUrl()
collapses duplicate path slashes while preserving scheme://.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
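
As an illustration of the slash-collapsing behaviour described in this commit, a minimal sketch might look like the following. The function name, the caller-supplied output buffer, and the error set are assumptions made for the example. The PR's normalizeUrl() may differ, for instance in how it treats query strings, which this sketch collapses as well.

```zig
const std = @import("std");

/// Hypothetical sketch: copies `url` into `out`, collapsing runs of '/'
/// that appear after the "scheme://" prefix.
fn normalizeUrlSketch(url: []const u8, out: []u8) error{NoSpaceLeft}![]const u8 {
    if (url.len > out.len) return error.NoSpaceLeft;
    // Keep everything up to and including "://" untouched.
    const scheme_end = if (std.mem.indexOf(u8, url, "://")) |i| i + 3 else 0;
    @memcpy(out[0..scheme_end], url[0..scheme_end]);
    var n: usize = scheme_end;
    for (url[scheme_end..]) |c| {
        // Skip a slash that directly follows another slash in the path.
        if (c == '/' and n > scheme_end and out[n - 1] == '/') continue;
        out[n] = c;
        n += 1;
    }
    return out[0..n];
}

test "collapses duplicate path slashes, preserves scheme" {
    var buf: [64]u8 = undefined;
    const got = try normalizeUrlSketch("https://api.example.com//v1///chat", &buf);
    try std.testing.expectEqualStrings("https://api.example.com/v1/chat", got);
}
```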
loadOpenAIStyleConfig now validates base URLs using validateUrl(),
rejecting non-HTTP(S) schemes and malformed URLs before use.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changed headers_fn return type to error union, replaced catch {} with try
in all header operations. Updated test helper function signatures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added try to all http_client.post() calls which now returns !void
after the header count limit change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changed headers_fn return types to error{OutOfMemory}!, replaced catch {}
with try in all header put operations. Updated all call sites and test helpers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changed headers_fn return type, replaced catch {} with proper error
handling via callbacks in language model, image model, and embedding model.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaced catch {} with proper error handling across 41 files:
- Updated headers_fn types to error{OutOfMemory}! in all config files
- Changed getHeaders functions to use try instead of catch {}
- Added errdefer for proper cleanup on allocation failure
- Fixed callback-based functions to report errors via callbacks
- Removed dead code in openai-error.zig

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Recording failures in MockHttpClient are non-critical but should be
visible during debugging. Changed catch {} to catch with log.warn.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added safeCast() function that uses std.math.cast to safely convert
integers, returning error.IntegerOverflow instead of undefined behavior.
Exported from provider-utils.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
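
A helper of the kind this commit describes could be as small as the sketch below. The exact signature is an assumption. The commit message confirms only that it wraps `std.math.cast` and returns `error.IntegerOverflow`.

```zig
const std = @import("std");

/// Sketch of a checked integer conversion: std.math.cast returns null when
/// the value does not fit in T, which is mapped to an error here.
pub fn safeCast(comptime T: type, value: anytype) error{IntegerOverflow}!T {
    return std.math.cast(T, value) orelse error.IntegerOverflow;
}

test "safeCast accepts in-range values and rejects the rest" {
    const max_tokens: i64 = 200;
    try std.testing.expectEqual(@as(u8, 200), try safeCast(u8, max_tokens));

    const too_big: i64 = 300;
    try std.testing.expectError(error.IntegerOverflow, safeCast(u8, too_big));

    const negative: i64 = -1;
    try std.testing.expectError(error.IntegerOverflow, safeCast(u32, negative));
}
```

Unlike `@intCast`, which is safety-checked undefined behaviour on an out-of-range value, this lets a caller turn a bad API parameter into an ordinary error.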
Replaced @intCast with safeCast for all external/untrusted data:
API parameters (max_tokens, seed, etc.), timestamps, and user-provided
values. Only compile-time safe casts in json-value.zig remain.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added lifetime requirements to LanguageModelV3, EmbeddingModelV3, and
HttpClient. Created lifetime_example.zig showing correct/incorrect patterns.
All unreachable usages verified to be in test code only.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test verifies generateText calls model.doGenerate and returns the response
text. Currently fails because generateText returns placeholder data instead
of calling the model. (RED step of TDD)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace placeholder with actual model.doGenerate vtable call:
- Convert ai-level Messages to provider-level LanguageModelV3Prompt
- Build LanguageModelV3CallOptions with settings mapping
- Use synchronous callback pattern to capture GenerateResult
- Extract text from content, map finish reason and usage
- Pass caller's allocator to model for result data ownership

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test verifies streamText calls model.doStream and delivers text chunks
via callbacks. Currently fails because streamText returns placeholder
instead of calling the model. (RED step of TDD)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace placeholder with actual model.doStream vtable call:
- Build prompt and call options (same pattern as generateText)
- Create bridge context to translate LanguageModelV3StreamPart to
  ai-level StreamPart (text_delta, finish, error mapping)
- Accumulate text in StreamTextResult via processPart
- Forward translated parts to ai-level callbacks
- Use errdefer for clean result cleanup on error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RED phase - test expects mock embeddings but embed() returns placeholder empty values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces TODO placeholder with actual model.doEmbed call. Converts
provider-level f32 embeddings to ai-level f64 values. Maps usage
and model metadata from provider response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
tomjnsn and others added 30 commits February 11, 2026 11:07
Add errdefer cleanup to JsonValue.fromStdJson and clone
…der extraction

extractResponseHeadersSlice() and extractResponseHeaders() leaked
previously-duped strings if allocation failed mid-iteration. Added
proper errdefer to clean up partial allocations on error paths.

Closes #7

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add errdefer cleanup for partial allocs in header extraction
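
The leak described above follows a common shape, sketched here under assumptions. `dupeAll` is a hypothetical function, not the PR's extractResponseHeaders(). It duplicates strings in a loop and uses `errdefer` to release the copies already made if a later allocation fails.

```zig
const std = @import("std");

/// Hypothetical sketch: duplicates every input string. If any allocation
/// fails mid-iteration, the copies made so far are freed before returning.
fn dupeAll(allocator: std.mem.Allocator, inputs: []const []const u8) ![][]u8 {
    const out = try allocator.alloc([]u8, inputs.len);
    var filled: usize = 0;
    errdefer {
        for (out[0..filled]) |s| allocator.free(s);
        allocator.free(out);
    }
    for (inputs) |s| {
        out[filled] = try allocator.dupe(u8, s);
        filled += 1;
    }
    return out;
}

test "dupeAll copies every string" {
    const allocator = std.testing.allocator; // reports leaks at test end
    const inputs = [_][]const u8{ "content-type", "x-request-id" };
    const copies = try dupeAll(allocator, &inputs);
    defer {
        for (copies) |s| allocator.free(s);
        allocator.free(copies);
    }
    try std.testing.expectEqualStrings("x-request-id", copies[1]);
}
```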
generateText() was passing the base allocator to model.doGenerate(),
but using arena_allocator internally. Now passes arena_allocator so
provider temp allocations are cleaned up by the arena. Result strings
(text, response id/model_id) are duped to the base allocator so they
outlive the arena. Updated deinit to free these owned strings.

Closes #8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…atch

Fix allocator mismatch in generateText doGenerate call
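
The ownership rule in this fix (temporaries in the arena, results duplicated to the base allocator) can be illustrated with a small sketch. `buildGreeting` is a hypothetical function that stands in for generateText(); it is not code from this PR.

```zig
const std = @import("std");

/// Hypothetical sketch: scratch work happens in an arena that is torn down
/// on return, while the result is duplicated to `base` so it outlives the arena.
fn buildGreeting(base: std.mem.Allocator, name: []const u8) ![]u8 {
    var arena = std.heap.ArenaAllocator.init(base);
    defer arena.deinit(); // frees every temporary in one step
    const tmp = arena.allocator();

    const scratch = try std.fmt.allocPrint(tmp, "hello, {s}", .{name});
    // Returning `scratch` directly would dangle once the arena is freed.
    return base.dupe(u8, scratch);
}

test "result outlives the arena and is freed by the caller" {
    const allocator = std.testing.allocator;
    const text = try buildGreeting(allocator, "zig");
    defer allocator.free(text); // caller frees with the same base allocator
    try std.testing.expectEqualStrings("hello, zig", text);
}
```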
Use idiomatic .empty instead of {} for ArrayList initialization.
The original bug (Managed without .init(allocator)) was fixed by the
ArrayList migration in #3, but this makes the pattern consistent.

Closes #9

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use .empty for MockHttpClient ArrayList init
Object keys were written without escaping special characters (quotes,
backslashes, control chars), producing malformed JSON. Extracted string
escaping into writeJsonString() helper and reuse for both string values
and object keys. Added test for keys with special characters.

Closes #10

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Escape JSON object keys in stringifyTo
Added detection prefixes for Google (AIza), AWS (AKIA), Mistral (msk-),
Cohere (co-), Groq (gsk_), and xAI (xai-) API keys. Reordered
prefixes so sk-proj- is checked before sk- to avoid partial matches.

Closes #11

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Expand API key detection to cover more providers
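
To show why the prefix order matters, here is a minimal sketch of first-match prefix detection. The table holds a subset of the prefixes named above, and `matchKeyPrefix` is a hypothetical name. The real detection and redaction logic in security.zig is not shown in this PR page.

```zig
const std = @import("std");

// More specific prefixes come first: "sk-proj-" must precede "sk-",
// otherwise the shorter prefix would always win.
const key_prefixes = [_][]const u8{ "sk-proj-", "sk-", "AIza", "AKIA", "gsk_", "xai-" };

/// Hypothetical sketch: returns the first table entry that `text` starts with.
fn matchKeyPrefix(text: []const u8) ?[]const u8 {
    for (key_prefixes) |prefix| {
        if (std.mem.startsWith(u8, text, prefix)) return prefix;
    }
    return null;
}

test "the more specific prefix is reported" {
    try std.testing.expectEqualStrings("sk-proj-", matchKeyPrefix("sk-proj-abc123").?);
    try std.testing.expectEqualStrings("sk-", matchKeyPrefix("sk-abc123").?);
    try std.testing.expect(matchKeyPrefix("hello world") == null);
}
```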
The pre-append check using projected size (current + incoming > max)
is correct and prevents unnecessary allocation. Improved the comment
to document why the check happens before appending.

Closes #12

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clarify buffer growth check in EventSourceParser
The .finish event handler was overwriting total_usage with the finish
event's value, discarding any previously accumulated usage from
step_finish events. Now uses .add() to accumulate, matching the
step_finish pattern.

Closes #13

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Accumulate stream usage on finish instead of overwriting
…ject

std.json.Stringify.valueAlloc exists in Zig 0.15+ and is the correct
API. Updated comment to document this for clarity.

Closes #14

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix(ai): clarify std.json.Stringify.valueAlloc usage
Introduce stack-allocated ErrorDiagnostic following the idiomatic Zig
"Diagnostics out-parameter" pattern (same as std.json.Scanner.Diagnostics).
Provides HTTP status codes, error classification, retry hints, and
error messages to callers without requiring allocator or deinit.

Closes #29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat(provider): add ErrorDiagnostic type for rich error context
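
A rough sketch of the diagnostics out-parameter pattern follows. The field names, the fixed 128-byte message buffer, and `checkStatus` are assumptions made for illustration. Only the general shape (stack-allocated, optional pointer, no allocator or deinit) comes from the commit message.

```zig
const std = @import("std");

/// Hypothetical sketch of a stack-allocated diagnostic; no allocator needed.
const ErrorDiagnostic = struct {
    status_code: ?u16 = null,
    kind: Kind = .unknown,
    message_buf: [128]u8 = undefined,
    message_len: usize = 0,

    const Kind = enum { unknown, rate_limit, authentication, server_error };

    /// Copies `text` into the inline buffer, truncating if it is too long.
    fn setMessage(self: *ErrorDiagnostic, text: []const u8) void {
        const n: usize = @min(text.len, self.message_buf.len);
        @memcpy(self.message_buf[0..n], text[0..n]);
        self.message_len = n;
    }

    fn message(self: *const ErrorDiagnostic) []const u8 {
        return self.message_buf[0..self.message_len];
    }

    fn isRetryable(self: *const ErrorDiagnostic) bool {
        return self.kind == .rate_limit or self.kind == .server_error;
    }
};

/// Fails on non-2xx statuses and fills `diag` only when the caller passed one.
fn checkStatus(status: u16, diag: ?*ErrorDiagnostic) error{ApiCallFailed}!void {
    if (status >= 200 and status < 300) return;
    if (diag) |d| {
        d.status_code = status;
        d.kind = switch (status) {
            401 => .authentication,
            429 => .rate_limit,
            500...599 => .server_error,
            else => .unknown,
        };
        d.setMessage("request failed");
    }
    return error.ApiCallFailed;
}

test "diagnostic is populated on failure" {
    var diag: ErrorDiagnostic = .{};
    try std.testing.expectError(error.ApiCallFailed, checkStatus(429, &diag));
    try std.testing.expectEqual(@as(?u16, 429), diag.status_code);
    try std.testing.expect(diag.isRetryable());
    try std.testing.expectEqualStrings("request failed", diag.message());
}
```

Callers that do not need the extra context pass `null` and pay nothing for it.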
…rovider call sites

post() was non-functional: used anytype callbacks, had TODO comment, and
discarded all responses/errors via empty internal lambdas. Fix changes
callback params to concrete function pointer types matching request()
vtable signatures and properly forwards to self.request(). Updates all 6
provider call sites (OpenAI×5, Anthropic×1) to use correct
Response/HttpError callback types and adds HTTP status code checking.

Closes #30

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix HttpClient.post() and update provider call sites
…d API Options

Adds `error_diagnostic: ?*ErrorDiagnostic = null` to all 5 provider-level
CallOptions structs and all 8 high-level API Options structs, enabling
opt-in rich error context throughout the SDK.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…options

feat: add error_diagnostic to all CallOptions and API Options
…lures

Updates all 9 provider files (OpenAI×5, Anthropic×1, Google×3) to capture
HTTP errors and non-2xx responses into the ErrorDiagnostic out-parameter.
Providers now set status_code, kind, message, provider name, and response
body on failure instead of silently discarding error details.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: populate ErrorDiagnostic on HTTP failures in all providers
…CallOptions

Forwards the error_diagnostic out-parameter from all 8 high-level API
Options structs through to the provider-level CallOptions, completing
the diagnostic pipeline from caller to provider HTTP layer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ugh-apis

feat: thread error_diagnostic through high-level APIs to providers
Tests verify full diagnostic pipeline: MockHttpClient → provider → ErrorDiagnostic
- HTTP 429 rate limit: status_code, kind=rate_limit, retryable, JSON message extraction
- HTTP 401 auth error: kind=authentication, non-retryable
- Network error (connection_failed): kind=network, message propagation
- HTTP 500 server error: kind=server_error, retryable, non-JSON body fallback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement StdHttpClient using Zig 0.15's std.http.Client for real HTTP/HTTPS
requests with TLS support, replacing the previous stub. Add `zig build test-live`
step with 9 integration tests (OpenAI, Azure, xAI × generateText/streamText/
error-diagnostic) that skip gracefully when API keys are absent.

Also fixes:
- google-vertex: replace relative path imports with proper module imports
- anthropic: fix GenerateResult type mismatch in vtable callback
- anthropic: use std.io.Writer.Allocating for JSON serialization
- google: fix response_format handling (non-optional tagged union)

Closes #41

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>