bug: intermittent Empty completion with BigModel glm-5-turbo on long Sage runs #2874

## Summary

I hit repeated retries with:

`Retryable(Empty completion received - no content, tool calls, or valid finish reason)`

when using Forge against **BigModel** (`open.bigmodel.cn`) with **`glm-5-turbo`** during a long Sage session.

The error appears to occur after Forge logs a successful upstream connection and response event, but before it builds a non-empty final completion.

## Environment

- Forge: local build (`0.1.0-dev`)
- OS: macOS
- Provider: `big_model`
- Model: `glm-5-turbo`
- Endpoint: `https://open.bigmodel.cn/api/paas/v4/chat/completions`

## What happened

During a deep analysis run, earlier turns succeeded. At `message_count: 82`, Forge began retrying the same turn and failed 8 times with the empty completion error.

Relevant log excerpt (sanitized):

```json
{"timestamp":" 297.233908375s","level":"INFO","fields":{"message":"Connecting Upstream","url":"https://open.bigmodel.cn/api/paas/v4/chat/completions","model":"glm-5-turbo","message_count":"82","message_cache_count":"81"}}
{"timestamp":" 300.078292042s","level":"DEBUG","fields":{"message":"Received completion from Upstream"}}
{"timestamp":" 300.078443209s","level":"ERROR","fields":{"message":"Retry attempt due to error","error":"Retryable(Empty completion received - no content, tool calls, or valid finish reason)","model":"glm-5-turbo"}}
```

Then the same pattern repeated multiple times for the same turn.
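To confirm that the retries all hit the same turn, the JSON log lines can be grouped by the `message_count` of the preceding `Connecting Upstream` event. The following is a minimal Python sketch, assuming the log is one JSON object per line with the field names shown in the excerpt above. `count_empty_retries` is a hypothetical helper for triage, not part of Forge.

```python
import json
from collections import Counter

# Substring taken from the error text in the log excerpt.
EMPTY_MARKER = "Empty completion received"


def count_empty_retries(log_lines):
    """Count empty-completion retries per message_count.

    Assumes each line is a JSON object with a "fields" dict, as in the
    excerpt. Lines that are not JSON objects are skipped.
    """
    counts = Counter()
    current_turn = None  # message_count of the most recent upstream request
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        fields = record.get("fields") or {}
        if fields.get("message") == "Connecting Upstream":
            current_turn = fields.get("message_count")
        elif EMPTY_MARKER in (fields.get("error") or ""):
            counts[current_turn] += 1
    return counts
```

A result with a single key (for example `"82"`) and a count equal to the number of retries would indicate that every failure belongs to one turn rather than to several.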

## Raw SSE capture / replay

I captured the request and replayed the same payload shape (82 messages) directly against BigModel with `curl`. Across repeated runs, every response ended with a valid SSE tail:

- some runs ended with `finish_reason: "stop"` + `[DONE]`
- some runs ended with `finish_reason: "tool_calls"` + `[DONE]`
- no raw replay run reproduced an empty terminal completion

Example tail (`stop`):

```text
data: {"choices":[{"index":0,"finish_reason":"stop","delta":{"role":"assistant","content":""}}],"usage":{...}}
data: [DONE]
```

Example tail (`tool_calls`):

```text
data: {"choices":[{"index":0,"finish_reason":"tool_calls","delta":{"role":"assistant","content":""}}],"usage":{...}}
data: [DONE]
```
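Both tails carry `finish_reason` on a chunk whose `delta.content` is an empty string, so a parser has to record `finish_reason` even when there is no new content. A minimal Python sketch of that extraction follows. It assumes OpenAI-style `data:` lines as shown above, and `last_finish_reason` is a hypothetical helper, not Forge's parser.

```python
import json


def last_finish_reason(sse_lines):
    """Return the last non-empty finish_reason seen before `[DONE]`, or None."""
    finish_reason = None
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # tolerate malformed chunks in this sketch
        if not isinstance(chunk, dict):
            continue
        for choice in chunk.get("choices") or []:
            # The terminal chunk may have an empty content delta;
            # finish_reason is recorded regardless.
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return finish_reason
```

Running this over a saved replay (with the elided `usage` object left as captured) gives a quick check that the stream ended with a usable `finish_reason`.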

## Why I think this may be Forge-side (or at least needs better diagnostics)

The error is thrown when the aggregated stream result has:

- empty content
- no tool calls
- no finish reason
- no thought signature

Because the upstream connection succeeded and a completion event was logged, this is hard to debug without logging the raw response chunks at the point of failure.
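To make the four conditions concrete, here is a minimal Python sketch of a stream aggregator and the emptiness check. This is an illustration under assumptions, not Forge's actual implementation: the `Aggregate` type, its field names, and both functions are hypothetical, and chunks are assumed to be already-parsed OpenAI-style dicts.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Aggregate:
    content: str = ""
    tool_calls: List[dict] = field(default_factory=list)
    finish_reason: Optional[str] = None
    thought_signature: Optional[str] = None  # never populated in this sketch


def aggregate(chunks):
    """Fold parsed chunks into a single Aggregate."""
    agg = Aggregate()
    for chunk in chunks:
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            agg.content += delta.get("content") or ""
            agg.tool_calls.extend(delta.get("tool_calls") or [])
            # Keep the previous value when this chunk has no finish_reason.
            agg.finish_reason = choice.get("finish_reason") or agg.finish_reason
    return agg


def is_empty_completion(agg):
    """True only when all four conditions listed above hold."""
    return (
        not agg.content
        and not agg.tool_calls
        and agg.finish_reason is None
        and agg.thought_signature is None
    )
```

Under these assumptions, the `stop` tail from the replay would not count as empty, because `finish_reason` is set even though the content is empty. An empty result would then suggest that the terminal chunk either never arrived or was not folded into the aggregate.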

## Web/docs checks

- No obvious duplicate issue found in this repo for the exact error text.
- Z.ai docs note that abnormal SSE termination should indicate the reason via `finish_reason`.

## Request

Could maintainers please:

1. Investigate this edge case in OpenAI-compatible stream aggregation for BigModel/GLM-5, and
2. Add optional debug logging/dump of terminal parsed chunk state when `EmptyCompletion` is raised (to distinguish provider empty stream vs local parse/aggregation gap).
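As a rough idea of what the requested dump could contain, the Python sketch below summarizes the stream state at the moment the error would be raised. The function name, its parameters, and the output keys are all hypothetical, and chunks are assumed to be already-parsed OpenAI-style dicts. The actual shape would depend on Forge's internals.

```python
import json


def terminal_state_dump(chunks, saw_done, max_chars=500):
    """Summarize stream state for a debug log entry.

    `saw_done` is whether the `[DONE]` sentinel was observed. The last
    chunk is truncated to `max_chars` to keep log lines bounded.
    """
    finish_reasons = []
    content_chars = 0
    tool_call_deltas = 0
    for chunk in chunks:
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            content_chars += len(delta.get("content") or "")
            tool_call_deltas += len(delta.get("tool_calls") or [])
            if choice.get("finish_reason"):
                finish_reasons.append(choice["finish_reason"])
    last_chunk = json.dumps(chunks[-1])[:max_chars] if chunks else None
    return {
        "chunk_count": len(chunks),
        "saw_done": saw_done,
        "finish_reasons": finish_reasons,
        "content_chars": content_chars,
        "tool_call_deltas": tool_call_deltas,
        "last_chunk": last_chunk,
    }
```

A dump with `chunk_count: 0` would point toward an empty provider stream, while a non-zero count with a `finish_reason` present would point toward a local parse or aggregation gap.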

## Artifacts

Sanitized artifacts are in this gist:

https://gist.github.com/chindris-mihai-alexandru/43f711cbb45e214d3307025d25a28425

(contains core Forge log excerpt, SSE tails, replay summary, and duplicate-search notes)

