Skip to content

fix(sight): decode HPACK Huffman headers#670

Merged
Daydreamer-Li merged 1 commit into
alibaba:mainfrom
jfeng18:fix/http2-huffman
Jun 2, 2026
Merged

fix(sight): decode HPACK Huffman headers#670
Daydreamer-Li merged 1 commit into
alibaba:mainfrom
jfeng18:fix/http2-huffman

Conversation

@jfeng18

@jfeng18 jfeng18 commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

What

ParsedHttp2Frame::huffman_decode() previously returned the literal placeholder string <huffman:N bytes>, so any Huffman-encoded HTTP/2 header literal (:path, :status, content-type, request/response header values) came out unreadable. This broke path classification and SSE detection for HTTP/2 LLM-API traffic.

Delegate to the already-imported hpack crate's canonical Huffman decoder (RFC 7541 Appendix B), falling back to a lossy view of the raw bytes on a decode error. Adds RFC 7541 test vectors.

Why a separate PR

Originally bundled into #663 (polish) — that mixed a behavior fix with cleanup, violating "PR scope: one PR = one goal". Splitting it out gives the fix its own review surface and makes back-porting / cherry-picking trivial. Once this PR lands, #663 will rebase onto main and the duplicate commit drops out automatically.

Testing

  • 仅单测 (cargo test --lib parser::http2):

    • test_huffman_decode_rfc7541_vectors — RFC 7541 Appendix B C.4.1 "www.example.com", C.4.2 "no-cache", C.6.1 ":status 302".
    • test_huffman_decode_invalid_falls_back_to_lossy — discriminating signal: assert!(!out.starts_with("<huffman:")) flips red if the fix is reverted.
    • 17 http2 frame tests + full lib (439 tests) all pass.
  • 未 E2E: real HTTP/2 LLM-API traffic through SSL probe → frame parser → aggregator → audit pipeline is not exercised here; deferred to a follow-up real-traffic test on the AgentSight ECS.

Independent of #661#668 except #663, from which this commit was extracted.

ParsedHttp2Frame::huffman_decode() returned "<huffman:N bytes>", so any
Huffman-encoded HTTP/2 header literal (:path, :status, content-type, and
request/response header values) came out unreadable. This broke path
classification and SSE detection for HTTP/2 LLM-API traffic.

Delegate to the already-imported hpack crate's canonical Huffman decoder
(RFC 7541 Appendix B), falling back to a lossy view of the raw bytes on a
decode error. Add RFC 7541 test vectors.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@Daydreamer-Li Daydreamer-Li left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — fixes a real bug where Huffman-encoded HTTP/2 headers were unreadable, restoring SSE detection, path classification, and status code parsing; no new dependencies, scoped to the stateless decode path only, with RFC 7541 test coverage.

@Daydreamer-Li Daydreamer-Li merged commit 820dc43 into alibaba:main Jun 2, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:sight src/agentsight/

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants