Convert model output image file to png then save/moderate #72108
fisher-alice wants to merge 2 commits into alice/scale-image-moderation from
Pull request overview
This PR aims to prevent Azure AI Content Safety “image format not supported” 400s by normalizing model-generated images to PNG before uploading them as project assets and sending them to moderation.
Changes:
- Convert Vercel AI SDK `GeneratedFile` images to a browser `File` in PNG format via canvas.
- Update asset upload helper to accept a browser `File` (instead of `GeneratedFile`) and upload using `file.type`.
- Simplify image moderation helper to accept a browser `File` directly.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| apps/src/aichat/api/client/generateChatResponse.ts | Adds client-side PNG conversion and uses the converted File for upload + moderation. |
| apps/src/aichat/api/client/helpers/fileHelpers.ts | Changes generated-file upload to accept a browser File and removes prepareGeneratedFile. |
| apps/src/aichat/api/client/helpers/safetyHelpers.ts | Changes moderation helper to accept a browser File directly. |

    new Promise((resolve, reject) => {
      const blob = new Blob([new Uint8Array(file.uint8Array)], {
        type: file.mediaType,
      });

convertToPng sets the Blob MIME type from file.mediaType, but the PR description notes this value can be inaccurate. If the type is wrong, new Image() may fail to decode the object URL and the whole request will throw. Consider omitting the Blob type (or using a safe fallback) so the browser can sniff the bytes, and handle decode failures gracefully.
Good feedback - will omit mediaType.

    const canvas = document.createElement('canvas');
    canvas.width = img.naturalWidth;
    canvas.height = img.naturalHeight;
    canvas.getContext('2d')?.drawImage(img, 0, 0);

convertToPng draws with canvas.getContext('2d')?.drawImage(...) but doesn’t check whether the 2D context was actually created. If getContext('2d') returns null, this will silently produce a blank PNG that still gets uploaded and moderated. It’d be safer to reject (or throw) when the context is missing.

    - canvas.getContext('2d')?.drawImage(img, 0, 0);
    + const context = canvas.getContext('2d');
    + if (!context) {
    +   URL.revokeObjectURL(url);
    +   reject(new Error('canvas 2d context creation failed'));
    +   return;
    + }
    + context.drawImage(img, 0, 0);

makes sense - will update

    // Gemini API output format is not configurable, so we convert to PNG for consistent asset storage and moderation.
    const fileToSave = await convertToPng(file);

    let asset: ChatAsset;
    try {

await convertToPng(file) is outside the existing try/catch. If image decode or canvas.toBlob fails, this rejection will throw out of generateChatResponse, and performClientApiChatCompletion doesn’t catch it—so the user may lose the text response. Consider wrapping the PNG conversion in the same error-handling path as asset upload (log + continue) to preserve the current “return text even if file handling fails” behavior.

    - // Gemini API output format is not configurable, so we convert to PNG for consistent asset storage and moderation.
    - const fileToSave = await convertToPng(file);
    - let asset: ChatAsset;
    - try {
    + let fileToSave: GeneratedFile;
    + let asset: ChatAsset;
    + try {
    +   // Gemini API output format is not configurable, so we convert to PNG for consistent asset storage and moderation.
    +   fileToSave = await convertToPng(file);

Adding a separate try-catch around the conversion (rather than combining it with the upload one) so that we can log a specific error.
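A minimal sketch of what two separate try-catch blocks could look like, so that a failed conversion is logged on its own and the text response is still returned. Everything here is hypothetical and for illustration only: `handleGeneratedImage`, `ChatResult`, and the injected `convert`, `upload`, and `logError` callbacks are not names from this PR.

```typescript
// Hypothetical result shape: the text always comes back, the asset only if file handling succeeds.
interface ChatResult {
  text: string;
  assetUrl?: string;
}

// Hypothetical helper. `convert` stands in for convertToPng and `upload` for the asset upload.
async function handleGeneratedImage<Raw, Png>(
  text: string,
  rawFile: Raw,
  convert: (file: Raw) => Promise<Png>,
  upload: (file: Png) => Promise<string>,
  logError: (message: string, error: unknown) => void
): Promise<ChatResult> {
  let png: Png;
  try {
    png = await convert(rawFile);
  } catch (error) {
    // Conversion failures get their own log message, and the text is preserved.
    logError('PNG conversion failed', error);
    return {text};
  }

  try {
    const assetUrl = await upload(png);
    return {text, assetUrl};
  } catch (error) {
    // Upload failures are logged separately, again without losing the text.
    logError('Asset upload failed', error);
    return {text};
  }
}
```

Keeping the two blocks apart means the log message says which step failed, at the cost of a little duplication in the fallback path.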

    // Also, Vercel AI SDK does not always report media type accurately (seen from HoneyBadger Azure error reports).
    const convertToPng = (file: GeneratedFile): Promise<File> =>
      new Promise((resolve, reject) => {
        const blob = new Blob([new Uint8Array(file.uint8Array)], {

new Uint8Array(file.uint8Array) copies the image bytes before creating the Blob. Since file.uint8Array is already a Uint8Array, you can pass it directly to Blob to avoid an extra allocation/copy (which can be noticeable for large images).

    - const blob = new Blob([new Uint8Array(file.uint8Array)], {
    + const blob = new Blob([file.uint8Array], {

TS error otherwise.
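The thread does not quote the compiler error. One plausible cause (an assumption, not confirmed here) is that recent TypeScript DOM typings only accept views backed by a plain `ArrayBuffer` as Blob parts, while `file.uint8Array` may be typed over a wider buffer type. Copying through `new Uint8Array(...)` allocates a fresh buffer, which is what the existing code relies on. A small sketch with a hypothetical `copyBytes` helper:

```typescript
// Hypothetical helper: copies the bytes into a newly allocated buffer.
// Constructing a Uint8Array from another typed array always copies the contents.
function copyBytes(bytes: Uint8Array) {
  return new Uint8Array(bytes);
}

const originalBytes = new Uint8Array([1, 2, 3]);
const copiedBytes = copyBytes(originalBytes);
copiedBytes[0] = 9; // only the copy changes; originalBytes still holds its own data
```

The extra allocation Copilot points out is real, so this trades some memory for type safety.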

    // Converts any browser-renderable image to PNG via canvas.
    // Used with model-generated images since Vercel AI SDK doesn't currently expose output format configuration.
    // Also, Vercel AI SDK does not always report media type accurately (seen from HoneyBadger Azure error reports).
    const convertToPng = (file: GeneratedFile): Promise<File> =>

I wonder, should we do this on the gateway side? cc @edcodedotorg
And/or should we split this into two operations: 1) check actual file type (using magic bytes etc) and report discrepancies and 2) convert unsupported file types to a supported file type (PNG in this case)?
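For step 1, a rough sketch of magic-byte sniffing is below. The helper names (`sniffImageType`, `checkDeclaredType`) and the set of formats covered are assumptions for illustration; only the byte signatures themselves are standard.

```typescript
type SniffedImageType =
  | 'image/png'
  | 'image/jpeg'
  | 'image/gif'
  | 'image/webp'
  | 'unknown';

// True when `bytes` contains `signature` starting at `offset`.
const hasSignature = (
  bytes: Uint8Array,
  signature: number[],
  offset = 0
): boolean =>
  bytes.length >= offset + signature.length &&
  signature.every((value, i) => bytes[offset + i] === value);

// Hypothetical helper: guesses the image type from the leading bytes.
function sniffImageType(bytes: Uint8Array): SniffedImageType {
  // PNG: 89 50 4E 47 0D 0A 1A 0A
  if (hasSignature(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) {
    return 'image/png';
  }
  // JPEG: FF D8 FF
  if (hasSignature(bytes, [0xff, 0xd8, 0xff])) {
    return 'image/jpeg';
  }
  // GIF: "GIF8" (covers GIF87a and GIF89a)
  if (hasSignature(bytes, [0x47, 0x49, 0x46, 0x38])) {
    return 'image/gif';
  }
  // WebP: "RIFF" at offset 0 and "WEBP" at offset 8
  if (
    hasSignature(bytes, [0x52, 0x49, 0x46, 0x46]) &&
    hasSignature(bytes, [0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return 'image/webp';
  }
  return 'unknown';
}

// Hypothetical helper: compares the declared media type with the sniffed one.
function checkDeclaredType(declared: string, bytes: Uint8Array) {
  const actual = sniffImageType(bytes);
  return {declared, actual, matches: actual === declared.toLowerCase()};
}
```

A caller could log whenever `matches` is false (step 1) and only convert when `actual` is not a format moderation accepts (step 2).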
I am a bit confused by the above.
@sanchitmalhotra126 I think our moderation (generally and image specifically) being a concern of the gateway is a real possibility, but we need to get crisp about what our story is.
My current understanding of the situation is that the declared
100%. Re:
As far as I can tell, the Gemini API does not allow us to specify an output format for image types (based on some forum comments it seems to default to JPEG for speed). Whether or not we absorb this into the gateway right away, I think our code somewhere needs to answer the following questions:
After which, we can decide to report on any discrepancies and/or convert to a known format (or just discard if we want to treat this as a generation error).
This PR converts model-generated images to PNG files to avoid the following reported error:
Example report
We use the Vercel AI SDK on the client to generate assistant responses when the Gemini 2.5 Flash Image model is selected. The returned generated file includes a media type, but it does not always seem to match the actual file type: we already return early when the file extension is not supported, at:
code-dot-org/apps/src/util/moderateImage.ts
Lines 49 to 54 in 382b958
But we are still seeing reported errors that 'the image format is not supported'.
So this PR converts browser-renderable images to PNG files before saving to project assets and moderating.
Links
Testing story
Tested locally on `aichat` levels with the Gemini 2.5 Flash Image model selected. No change for user.
Deployment notes
Privacy and security