feat(cloudflare): add vision support (Gemma 4, Llama 4 Scout, Llama 3.2 Vision) #43
Merged
stackbilt-admin merged 1 commit into main on Apr 17, 2026
…ma 3.2 Vision

CloudflareProvider previously advertised no vision capability; any request.images sent via the CF path was silently dropped. This wires request.images through to CF's OpenAI-compatible image_url content-part shape (base64 data URIs) and ships three vision-capable model entries.

Factory's analyzeImage() now dispatches to CF when configured, and getDefaultVisionModel() falls back to @cf/google/gemma-4-26b-a4b-it for CF-only deployments. Non-vision models and HTTP image URLs throw ConfigurationError with the vision-capable alternatives named.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `CloudflareProvider` previously advertised no vision capability; any `request.images` sent via the CF path was silently dropped. This wires image input through to CF's OpenAI-compatible `image_url` content-part shape (base64 data URIs).
- Ships three vision-capable model entries: `@cf/google/gemma-4-26b-a4b-it` (256K ctx + tools), `@cf/meta/llama-4-scout-17b-16e-instruct` (multimodal + tools), and `@cf/meta/llama-3.2-11b-vision-instruct`.
- Factory's `analyzeImage()` now dispatches to CF when configured; `getDefaultVisionModel()` falls back to Gemma 4 for CF-only deployments.
- Non-vision models and HTTP image URLs throw `ConfigurationError` with the vision-capable alternatives named.
- Bumps to v1.3.0 (additive, strict semver).
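To make the payload shape concrete, here is a minimal, self-contained sketch of how images might be attached to the last user message as OpenAI-style `image_url` content parts. The helper names `attachImagesToLastUserMessage` and `buildImageDataUrl` come from this PR, but their signatures, the `ImageInput` and `ChatMessage` types, the `image/png` default, and the error messages are assumptions for illustration, not the code in `src/providers/cloudflare.ts`.

```typescript
// Hypothetical types: the library's real request and message types may differ.
type TextPart = { type: "text"; text: string };
type ImageUrlPart = { type: "image_url"; image_url: { url: string } };
type ContentPart = TextPart | ImageUrlPart;

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string | ContentPart[];
}

interface ImageInput {
  data: string; // assumed: raw base64, or an existing data: URI
  mimeType?: string; // assumed field name
}

// Local stand-in for the library's ConfigurationError.
class ConfigurationError extends Error {}

// The three vision-capable models named in this PR.
const VISION_MODELS: readonly string[] = [
  "@cf/google/gemma-4-26b-a4b-it",
  "@cf/meta/llama-4-scout-17b-16e-instruct",
  "@cf/meta/llama-3.2-11b-vision-instruct",
];

function buildImageDataUrl(image: ImageInput): string {
  // Existing data: URIs pass through unchanged.
  if (/^data:/i.test(image.data)) return image.data;
  // HTTP(S) URLs are rejected rather than fetched.
  if (/^https?:\/\//i.test(image.data)) {
    throw new ConfigurationError(
      "HTTP image URLs are not supported; pass base64 data or a data: URI"
    );
  }
  // Assumed default MIME type when none is given.
  return `data:${image.mimeType ?? "image/png"};base64,${image.data}`;
}

function attachImagesToLastUserMessage(
  model: string,
  messages: ChatMessage[],
  images: ImageInput[]
): ChatMessage[] {
  if (images.length === 0) return messages;
  if (VISION_MODELS.indexOf(model) === -1) {
    throw new ConfigurationError(
      `${model} does not support vision; use one of: ${VISION_MODELS.join(", ")}`
    );
  }
  const lastUser = messages.map((m) => m.role).lastIndexOf("user");
  if (lastUser === -1) {
    throw new ConfigurationError("No user message to attach images to");
  }
  const imageParts = images.map(
    (img): ContentPart => ({
      type: "image_url",
      image_url: { url: buildImageDataUrl(img) },
    })
  );
  // Return a copy: plain-string content becomes a text part followed by the images.
  return messages.map((m, i): ChatMessage => {
    if (i !== lastUser) return m;
    const existing: ContentPart[] =
      typeof m.content === "string"
        ? [{ type: "text", text: m.content }]
        : m.content;
    return { ...m, content: [...existing, ...imageParts] };
  });
}
```

Validating the model before building any data URI means a misconfigured request fails before any payload is built, which matches the PR's goal of replacing a silent drop with an explicit `ConfigurationError`.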
Motivation
The foodfiles team reported that `analyzeImage()` routed to CF would silently drop the image. Options evaluated: call CF AI directly (strands the abstraction), fall back to Anthropic vision (works but costs per image), or properly add CF vision to the library. This PR is the proper fix.
Changes
- `src/providers/cloudflare.ts` (+104/-3): `supportsVision = true`, three new model entries in `getModelCapabilities()`, `attachImagesToLastUserMessage` + `buildImageDataUrl` helpers, validation for non-vision model + HTTP URL.
- `src/factory.ts` (+1): CF fallback in `getDefaultVisionModel`.
- `src/__tests__/cloudflare.test.ts` (+93): 6 new tests covering payload shape, multi-image, data-URL passthrough, HTTP-URL rejection, non-vision rejection, and capability flags.
- `CHANGELOG.md`, `package.json`: v1.3.0 entry.
Test plan
- `npm run typecheck`: clean
- `npm test`: 219/219 passing (14 cloudflare tests, 6 new)
- … `@cf/google/gemma-4-26b-a4b-it` once published
Consumer usage
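As a rough illustration of how a consumer might rely on the new fallback, the sketch below picks a default vision model and builds a request carrying `images`. Everything here is a local stand-in: the `VisionRequest` shape, the parameters of `getDefaultVisionModel`, and the `buildVisionRequest` helper are hypothetical, and the library's real factory API may look different. Only the Gemma 4 fallback model ID and the presence of `images` on the request are taken from this PR.

```typescript
// Fallback model for CF-only deployments, as named in this PR.
const CF_DEFAULT_VISION_MODEL = "@cf/google/gemma-4-26b-a4b-it";

// Assumed request shape; only `images` on the request is confirmed by the PR text.
interface VisionRequest {
  model: string;
  messages: { role: "user"; content: string }[];
  images: { data: string; mimeType: string }[];
}

// Hypothetical signature: prefer another provider's vision default when one
// exists, otherwise fall back to Gemma 4 when Cloudflare is configured.
function getDefaultVisionModel(
  otherProviderDefault: string | undefined,
  cloudflareConfigured: boolean
): string {
  if (otherProviderDefault) return otherProviderDefault;
  if (cloudflareConfigured) return CF_DEFAULT_VISION_MODEL;
  throw new Error("No vision-capable provider is configured");
}

// Hypothetical helper: builds the request a consumer would hand to the provider.
function buildVisionRequest(
  prompt: string,
  base64Image: string,
  mimeType: string,
  model: string
): VisionRequest {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    images: [{ data: base64Image, mimeType }],
  };
}

// CF-only deployment: no other provider default, Cloudflare configured.
const model = getDefaultVisionModel(undefined, true);
const request = buildVisionRequest(
  "List the ingredients visible in this photo.",
  "AAAA", // placeholder base64 payload
  "image/jpeg",
  model
);
console.log(request.model);
```

Because the change is additive, consumers with a non-Cloudflare vision provider configured would presumably see no change in which model is selected; only CF-only deployments pick up the new default.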
🤖 Generated with Claude Code