Web UI fails to render non-text parts #414

@slilichenko

Description

ADK web is broken in several ways when handling non-text parts. A simple agent to demonstrate the problems:

from google.adk.agents import Agent

root_agent = Agent(
    name="image_display",
    model="gemini-3.1-flash-image-preview",
    description=(
        "Agent who can generate images."
    ),
    instruction=(
        """
        You are an agent who can generate images. 
        """
    )
)

Given a prompt of "Generate an image of a flower and explain what kind of flower it is.", the agent generates these events:

{
  "id": "bfe28ad9-a577-4a28-b3c4-2a378468a8ae",
  "appName": "image_display",
  "userId": "user",
  "state": {},
  "events": [
    {
      "content": {
        "parts": [
          {
            "text": "Generate an image of a flower and explain what kind of flower it is."
          }
        ],
        "role": "user"
      },
      "invocationId": "e-d4bb169c-7ccc-47e6-9ea2-46639c1bb9e9",
      "author": "user",
      "actions": {
        "stateDelta": {},
        "artifactDelta": {},
        "requestedAuthConfigs": {},
        "requestedToolConfirmations": {}
      },
      "id": "2dc12d0d-2b40-4c14-8f87-ac9a424451bc",
      "timestamp": 1774470817.5733912
    },
    {
      "modelVersion": "gemini-3.1-flash-image-preview",
      "content": {
        "parts": [
          {
            "text": "**Visualizing a Floral Subject**\n\nI'm now focused on the initial step: creating a detailed image. I've decided on a rose and am working on generating that visual asset. My next step will be analyzing the image created.\n\n\n",
            "thought": true
          },
          {
            "text": "**Defining the Image's Details**\n\nI'm starting a close-up photograph of a red rose. Next, I plan to identify it and will then describe the flower's color, structure, and associated meanings.\n\n\n",
            "thought": true
          },
          {
            "inlineData": {
              "data": "<image-data>",
              "mimeType": "image/png"
            },
            "thought": true
          },
          {
            "text": "**Refining Visual Accuracy**\n\nI'm focused on confirming the visual fidelity. The rose image is being reviewed for how well it matches the prompt's details: color, clarity, and overall composition. The image's subject and details are under scrutiny.\n\n\n",
            "thought": true
          },
          {
            "text": "**Evaluating Image Compliance**\n\nI've checked the image against the prompt, focusing on key elements like color and detail. I determined that the image meets the user's requirements. I'm preparing the final delivery, which includes both the image and the explanation.\n\n\n",
            "thought": true
          },
          {
            "inlineData": {
              "data": "<image-data>",
              "mimeType": "image/png"
            },
            "thoughtSignature": "<signature>"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "usageMetadata": {
        "candidatesTokenCount": 1120,
        "candidatesTokensDetails": [
          {
            "modality": "IMAGE",
            "tokenCount": 1120
          }
        ],
        "promptTokenCount": 56,
        "promptTokensDetails": [
          {
            "modality": "TEXT",
            "tokenCount": 56
          }
        ],
        "thoughtsTokenCount": 161,
        "totalTokenCount": 1337,
        "trafficType": "ON_DEMAND"
      },
      "avgLogprobs": 1.1017857142857144,
      "invocationId": "e-d4bb169c-7ccc-47e6-9ea2-46639c1bb9e9",
      "author": "image_display",
      "actions": {
        "stateDelta": {},
        "artifactDelta": {},
        "requestedAuthConfigs": {},
        "requestedToolConfirmations": {}
      },
      "id": "091f9644-e3c6-4c17-805e-fccffc259572",
      "timestamp": 1774470817.577141
    }
  ],
  "lastUpdateTime": 1774470817.577141
}
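To make the shape of the model event easier to see, here is a minimal sketch that classifies each part as text or media and notes whether it is a thought. It runs on a trimmed copy of the event above (the text bodies are shortened and `<image-data>` stands in for the base64 payload); `summarize_parts` is a hypothetical helper, not part of ADK.

```python
import json

# Trimmed copy of the model event above. Text bodies are shortened and
# "<image-data>" is a placeholder for the real base64 payload.
EVENT_JSON = """
{
  "content": {
    "parts": [
      {"text": "**Visualizing a Floral Subject**", "thought": true},
      {"inlineData": {"data": "<image-data>", "mimeType": "image/png"}, "thought": true},
      {"text": "**Evaluating Image Compliance**", "thought": true},
      {"inlineData": {"data": "<image-data>", "mimeType": "image/png"},
       "thoughtSignature": "<signature>"}
    ],
    "role": "model"
  }
}
"""


def summarize_parts(event: dict) -> list[tuple[str, bool]]:
    """Return (kind, is_thought) for every part, in order (hypothetical helper)."""
    summary = []
    for part in event.get("content", {}).get("parts", []):
        if "inlineData" in part:
            # Media parts are identified by their MIME type.
            kind = part["inlineData"].get("mimeType", "unknown")
        elif "text" in part:
            kind = "text"
        else:
            kind = "other"
        summary.append((kind, bool(part.get("thought", False))))
    return summary


event = json.loads(EVENT_JSON)
summary = summarize_parts(event)
for kind, is_thought in summary:
    print(f"{kind:10s} thought={is_thought}")

# Parts that are not thoughts, i.e. the actual answer shown to the user.
final_parts = [kind for kind, is_thought in summary if not is_thought]
```

In this event the only non-thought part is the final `image/png` part, so a UI that renders only text parts has nothing to show for the answer.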

But the display didn't include the image(s):

[Screenshot]

Cursory review of the code:

  • this line doesn't seem to work: mediaType is not set.
  • The fallback button is not functional: OnClick doesn't work, and name is not the correct attribute. It should be displayName if it's copied as-is from Part.
  • Combining multiple parts into a single message results in "data loss": when you load one of the previous sessions, only the latest text part is displayed.

[Screenshot]
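The "data loss" point can be illustrated with a small sketch. This is not the ADK web code (which is not reproduced here); `merge_into_one_message`, `render_items`, and `RenderItem` are hypothetical names, and the merge function only assumes the failure mode described above, where each part overwrites fields on a single message. The alternative keeps one render item per part.

```python
from dataclasses import dataclass


@dataclass
class RenderItem:
    """One displayable unit per part (hypothetical structure)."""
    kind: str            # "text" or "media"
    payload: str         # text body, or base64 data for media
    mime_type: str = ""
    is_thought: bool = False


def merge_into_one_message(parts: list[dict]) -> dict:
    """Lossy: every part overwrites the same fields, so only the last one survives."""
    message: dict = {}
    for part in parts:
        if "text" in part:
            message["text"] = part["text"]
        if "inlineData" in part:
            message["inlineData"] = part["inlineData"]
    return message


def render_items(parts: list[dict]) -> list[RenderItem]:
    """Lossless: emit one item per part, preserving order and the thought flag."""
    items = []
    for part in parts:
        thought = bool(part.get("thought", False))
        if "text" in part:
            items.append(RenderItem("text", part["text"], is_thought=thought))
        elif "inlineData" in part:
            blob = part["inlineData"]
            items.append(
                RenderItem("media", blob["data"], blob.get("mimeType", ""), thought)
            )
    return items


# Shortened stand-ins for the parts of the model event in this issue.
parts = [
    {"text": "first thought", "thought": True},
    {"inlineData": {"data": "<draft-image>", "mimeType": "image/png"}, "thought": True},
    {"text": "second thought", "thought": True},
    {"inlineData": {"data": "<final-image>", "mimeType": "image/png"}},
]

merged = merge_into_one_message(parts)
items = render_items(parts)
print(merged["text"])                       # only the last text part remains
print([(i.kind, i.is_thought) for i in items])
```

Keeping one item per part also lets the UI decide separately how to show thoughts and how to pick a renderer from `mime_type`.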
