Skip to content

fix(sight): per-call token in audit records#681

Open
jfeng18 wants to merge 1 commit into
alibaba:mainfrom
jfeng18:fix/sight-per-call-token
Open

fix(sight): per-call token in audit records#681
jfeng18 wants to merge 1 commit into
alibaba:mainfrom
jfeng18:fix/sight-per-call-token

Conversation

@jfeng18

@jfeng18 jfeng18 commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

What

agentsight audit --type llm --json showed input_tokens=None / output_tokens=None for every LLM call, even though the aggregate (agentsight token) reported the correct total. Users could see "how much did I spend today" but not "which prompt cost the most."

Root causes (two independent gaps)

  1. Token extraction gated on is_sse only — non-streaming LLM responses (stream:false) never had their usage JSON parsed. The fallback in analyze_aggregated() only tried the SSE chunk-array path.

  2. Audit record creation gated on is_sse onlyanalyze_http() returned None for all non-SSE requests, so non-streaming completions never got an audit record at all. Also, is_sse was hardcoded to true in the audit record struct.

Fixes

  • Relax the token extraction fallback to also try parse_data() on single-object response bodies (non-streaming path).
  • Add is_llm_path check that admits known LLM API endpoints (/chat/completions, /v1/messages, /v1/completions, /api/v1/copilot/generate_copilot) even when is_sse=false.
  • Change is_sse: true hardcode to is_sse: http_record.is_sse.

Testing

Verification Status What was checked
已 E2E (ECS, kernel 6.6.102) cosh SSE call to qwen3-coder-plus: audit record now shows in=17074 out=1 model=qwen3-coder-plus is_sse=true (was None/None/None). Aggregate unchanged.
仅单测 (3 new discriminating tests) (1) non-SSE LLM path → audit record with is_sse=false + correct token counts; (2) non-LLM path → no audit record; (3) SSE path → is_sse=true. Reverting either fix makes the corresponding test fail. 442 lib tests, zero regressions.
未 E2E Non-streaming (stream:false) path not exercisable via cosh (always streams) or curl (short-lived process, not traced). Logic correctness covered by unit tests + parse_data() already validated in cache_shadow's saved_tokens fallback.

Independent of #661#668, #676, #680.

Fixes #699

@jfeng18

jfeng18 commented Jun 6, 2026

Copy link
Copy Markdown
Contributor Author

Gentle ping — this fixes #699 (per-call token always null in audit records). 3 discriminating unit tests + ECS E2E verified. Ready for review.

@jfeng18 jfeng18 force-pushed the fix/sight-per-call-token branch 2 times, most recently from 9c2caf8 to 3558624 Compare June 6, 2026 13:22
@jfeng18

jfeng18 commented Jun 6, 2026

Copy link
Copy Markdown
Contributor Author

Per-call token audit (123 lines). Adds per-invocation token tracking to the audit path.

The aggregate token count (agentsight token) was correct, but
per-call audit records (agentsight audit --type llm --json)
showed input_tokens=None / output_tokens=None for every call.

Two root causes:

1. Token extraction was gated on is_sse — non-streaming LLM
   responses never had their usage JSON parsed. Relax the gate
   to also try parse_data() on single-object response bodies.

2. analyze_http() filtered out all non-SSE requests, so non-
   streaming completions never got an audit record at all. Add
   an is_llm_path check that admits known LLM API endpoints
   even when is_sse=false. Also fix a latent bug: is_sse was
   hardcoded to true in the audit record; changed to propagate
   http_record.is_sse.

3 discriminating unit tests added (non-SSE LLM → audit with
correct is_sse=false + tokens; non-LLM path → no audit;
SSE → is_sse=true).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@jfeng18 jfeng18 force-pushed the fix/sight-per-call-token branch from 3558624 to e79ff6e Compare June 10, 2026 03:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:sight src/agentsight/

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[sight] bug: per-call token field is always null in audit records

1 participant