
streaming=True in /run_sse returns empty text after AgentTool calls (works with streaming=False) #3754

@valentinozegna

GitHub Issue: Empty Text Response After AgentTool Calls in SSE Streaming Mode

Bug Description

When using the /run_sse endpoint with streaming: true, the final text response after an AgentTool call (e.g., WebSearch wrapped in AgentTool) is empty. The tool executes successfully and returns results, but the agent's synthesized response based on those results is not included in the SSE stream.

With streaming: false, the same request works correctly: the agent synthesizes and returns a final text response after the tool call.

Expected Behavior

After an AgentTool executes (e.g., WebSearch), the agent should synthesize the tool results and return a final text response in the SSE stream.

Actual Behavior

  • Tool calls execute correctly
  • Tool responses are received
  • Final text response is missing/empty
  • The stream ends without the agent's synthesized answer

Reproduction Steps

Server Setup (Missing from original issue)

Here's the minimal server setup using get_fast_api_app():

# server.py
from google.adk.cli.fast_api import get_fast_api_app

app = get_fast_api_app(
    agents_dir="./agents",
    session_service_uri="agentengine://YOUR_AGENT_ENGINE_ID",
    memory_service_uri="agentengine://YOUR_AGENT_ENGINE_ID",
    web=True,
)

# Run with: uvicorn server:app --port 8000

Agent structure (agents/root_agent/agent.py):

from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool
from google.adk.tools import google_search

# Sub-agent for search
search_agent = LlmAgent(
    name="web_search_agent",
    model="gemini-2.5-flash",
    tools=[google_search],
)

# Root agent using AgentTool
root_agent = LlmAgent(
    name="root_agent",
    model="gemini-2.5-pro",
    tools=[AgentTool(agent=search_agent)],
    instruction="You are a helpful assistant. Use web_search_agent to find information.",
)

Complete Reproduction Steps

  1. Set up the server as shown above
  2. Start server: uvicorn server:app --port 8000
  3. Create a session:
curl -s -X POST "http://localhost:8000/apps/root_agent/users/test_user/sessions" \
  -H "Content-Type: application/json" | jq -r '.id'
# Returns: SESSION_ID
  4. Bug case (streaming: true):
curl -s -X POST "http://localhost:8000/run_sse" \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "root_agent",
    "user_id": "test_user",
    "session_id": "SESSION_ID",
    "new_message": {
      "role": "user",
      "parts": [{"text": "Search for the fastest opamp"}]
    },
    "streaming": true
  }'

Result: Tool executes, but final text response is empty/missing.

  5. Working case (streaming: false):
curl -s -X POST "http://localhost:8000/run_sse" \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "root_agent",
    "user_id": "test_user",
    "session_id": "SESSION_ID",
    "new_message": {
      "role": "user",
      "parts": [{"text": "Search for the fastest opamp"}]
    },
    "streaming": false
  }'

Result: Complete response including synthesized answer like "Based on my search, the fastest opamp is..."
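
To check for the symptom programmatically rather than by reading the raw stream, the output of the two curl commands above can be parsed. The following is a minimal sketch, not part of the original report. It assumes each event arrives as a `data: <json>` line, that text lives under `content.parts[].text`, that streamed chunks carry a `partial` flag, and that tool results appear as a `functionResponse` (or `function_response`) part. Those field names are assumptions about the event payload, and all helper names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per `data:` line of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def _parts(event: dict) -> list:
    """Return the dict-shaped parts of an event (assumed payload layout)."""
    parts = (event.get("content") or {}).get("parts") or []
    return [p for p in parts if isinstance(p, dict)]


def event_text(event: dict) -> str:
    """Concatenate the non-thought text parts of one event."""
    return "".join(
        p.get("text") or "" for p in _parts(event) if not p.get("thought")
    )


def has_tool_response(event: dict) -> bool:
    """True if any part looks like a tool response (assumed key names)."""
    return any(
        "functionResponse" in p or "function_response" in p
        for p in _parts(event)
    )


def text_after_last_tool_response(events: Iterable[dict]) -> str:
    """Return the last complete text that follows the final tool response."""
    result = ""
    for event in events:
        if has_tool_response(event):
            result = ""  # text seen earlier was not the synthesized answer
            continue
        if event.get("partial"):
            continue  # streamed chunk; only complete events are counted
        text = event_text(event)
        if text:
            result = text
    return result
```

Under these assumptions, feeding the saved output of the `streaming: true` request into `text_after_last_tool_response(iter_sse_events(lines))` would return an empty string in the bug case, and the synthesized answer in the `streaming: false` case.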

Key Observation

The streaming parameter maps to StreamingMode:

  • streaming: false → StreamingMode.NONE → Works
  • streaming: true → StreamingMode.SSE → Bug: empty response after AgentTool

This is why adk web works - it uses streaming: false by default.
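
The mapping described above can be written out as a small sketch. The `StreamingMode` class here is a local stand-in that only mirrors the two mode names mentioned in this issue; it is not the ADK class, and `streaming_mode_for` is a hypothetical helper, not the server's actual code.

```python
from enum import Enum


class StreamingMode(Enum):
    """Local stand-in for the two modes named in this issue (not the ADK class)."""

    NONE = "none"
    SSE = "sse"


def streaming_mode_for(streaming: bool) -> StreamingMode:
    """Map the request's `streaming` flag to a mode, as the issue describes."""
    return StreamingMode.SSE if streaming else StreamingMode.NONE
```

Only the `StreamingMode.SSE` branch shows the missing final response in this report.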

  1. Create an agent with an AgentTool (e.g., WebSearch):
from google.adk.agents import Agent, LlmAgent
from google.adk.tools import google_search
from google.adk.tools.agent_tool import AgentTool

# Create a search sub-agent
search_agent = Agent(
    name="web_search_agent",
    model="gemini-2.5-flash",
    tools=[google_search],  # Built-in search tool
)

# Root agent with AgentTool
root_agent = LlmAgent(
    name="root_agent",
    model="gemini-2.5-pro",
    tools=[
        AgentTool(agent=search_agent),
    ],
    instruction="You are a helpful assistant. Use WebSearch to find information.",
)
  2. Start the ADK server:
poetry run uvicorn server:app --port 8000
  3. Create a session:
curl -s -X POST "http://localhost:8000/apps/root_agent/users/test_user/sessions" \
  -H "Content-Type: application/json"
# Returns: {"id": "SESSION_ID", ...}
  4. With streaming: true (BUG):
curl -s -X POST "http://localhost:8000/run_sse" \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "root_agent",
    "user_id": "test_user",
    "session_id": "SESSION_ID",
    "new_message": {
      "role": "user",
      "parts": [{"text": "Find me the fastest opamp on the market"}]
    },
    "streaming": true
  }'

Result: SSE events include:

  • Agent thought (thinking about using WebSearch)
  • Tool call (WebSearch invocation)
  • Tool response (search results)
  • Missing: Final text response synthesizing the results
  5. With streaming: false (WORKS):
curl -s -X POST "http://localhost:8000/run_sse" \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "root_agent",
    "user_id": "test_user",
    "session_id": "SESSION_ID",
    "new_message": {
      "role": "user",
      "parts": [{"text": "Find me the fastest opamp on the market"}]
    },
    "streaming": false
  }'

Result: All events received correctly, including the final text response like:

"Based on my search, the fastest opamp is the TI OPA855 with 8GHz bandwidth..."

Key Observation

The adk web command works correctly because it internally uses streaming: false by default. This is why testing via adk web shows the complete response, but programmatic access with streaming: true fails.

Environment

  • google-adk version: 1.19.0
  • Python version: 3.12
  • Server: get_fast_api_app() with default VertexAiSessionService
  • Model: gemini-2.5-pro (also reproducible with gemini-2.5-flash)
  • Tool: Any AgentTool wrapping a sub-agent (WebSearch, custom agents, etc.)

Workaround

Use streaming: false in API requests. The /run_sse endpoint still returns SSE-formatted events, but with complete (non-partial) events only.

# In client code
body = {
    "streaming": False,  # Workaround for empty response bug
    # ... remaining request fields unchanged
}
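
For a fuller picture of the workaround, here is a minimal client sketch using only the standard library. The request body mirrors the curl examples in this issue; the helper names `build_run_request` and `post_run_sse`, the timeout value, and the base URL passed by the caller are illustrative assumptions.

```python
import json
import urllib.request


def build_run_request(app_name, user_id, session_id, text, streaming=False):
    """Build the JSON body used by the curl examples in this issue."""
    return {
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,
        "new_message": {"role": "user", "parts": [{"text": text}]},
        "streaming": streaming,  # False is the workaround described above
    }


def post_run_sse(base_url, body, timeout=120):
    """POST the body to /run_sse and yield response lines as they arrive."""
    request = urllib.request.Request(
        f"{base_url}/run_sse",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8").rstrip("\r\n")
```

A caller would build the body with `build_run_request("root_agent", "test_user", session_id, prompt)` and iterate over `post_run_sse("http://localhost:8000", body)`. Leaving `streaming` at its default of `False` keeps the client on the working path until the bug is fixed.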

Related Issues

Additional Context

This issue specifically affects the text response after tool execution. The streaming works fine for:

  • Initial agent thoughts
  • Tool call events
  • Tool response events

Only the final synthesized text response is missing when streaming: true.
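
To make the breakdown above easy to verify on a captured stream, the decoded events can be tallied by kind. This is a rough diagnostic sketch. It assumes events are already decoded into dicts, that parts use `functionCall` / `functionResponse` keys (or their snake_case forms), and that thought text is marked with a `thought` flag. Those keys are assumptions about the payload, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def classify_event(event: dict) -> List[str]:
    """Label each part of an event as tool_call, tool_response, thought, or text."""
    kinds = []
    parts = (event.get("content") or {}).get("parts") or []
    for part in parts:
        if not isinstance(part, dict):
            continue
        if "functionCall" in part or "function_call" in part:
            kinds.append("tool_call")
        elif "functionResponse" in part or "function_response" in part:
            kinds.append("tool_response")
        elif part.get("text"):
            kinds.append("thought" if part.get("thought") else "text")
    return kinds


def summarize(events: Iterable[dict]) -> Counter:
    """Count how many parts of each kind appear across all events."""
    counts = Counter()
    for event in events:
        counts.update(classify_event(event))
    return counts
```

In the failing case described here, the summary would be expected to show thought, tool_call, and tool_response entries but a `text` count of zero.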

Impact

This is a blocking issue for any application that:

  1. Uses AgentTool for sub-agent orchestration
  2. Requires real-time streaming for UX
  3. Expects the agent to synthesize results from tool calls

Labels to Add

  • bug
  • streaming
  • AgentTool
  • run_sse

Labels

  • live: [Component] This issue is related to live, voice and video chat
  • needs review: [Status] The PR/issue is awaiting review from the maintainer
