
fix: buffer non-streaming response body for token reconciliation (closes #12) #20

Merged

levleontiev merged 3 commits into main from
fix/issue-12-non-streaming-token-reconciliation on Mar 12, 2026

Conversation

@levleontiev
Contributor

Summary

  • streaming.lua: add _reconcile_non_streaming() that buffers up to 1 MiB of upstream response body, extracts usage.total_tokens via cost_extractor, and calls llm_limiter.reconcile() to refund unused pessimistic TPM/TPD reservations
  • streaming.lua: extend body_filter() to handle non-streaming path (ctx present but active=false) — buffer chunks, trigger reconcile on EOF
  • streaming.lua: initialise body_buffer field in init_stream()
  • streaming_spec.lua: add mock_cjson_safe setup + 4 new Gherkin scenarios (full body, chunked delivery, missing usage field, no-context passthrough)
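The reconcile step described above can be sketched in plain Lua. This is an illustrative shape only, not the module's actual code: `reconcile_non_streaming`, its parameters, and the injected `decode`/`reconcile` callbacks are assumed names, with the decoder following the `cjson.safe` convention of returning `nil` on bad JSON.

```lua
-- Hypothetical sketch of the non-streaming reconcile step (names assumed).
local MAX_BODY = 1024 * 1024  -- buffer at most 1 MiB of response body

-- body:            full buffered response body (string)
-- reserved_tokens: pessimistic TPM/TPD reservation made at request time
-- decode:          cjson.safe-style decoder (returns nil on bad JSON)
-- reconcile:       callback that refunds unused tokens to the limiter
local function reconcile_non_streaming(body, reserved_tokens, decode, reconcile)
  if #body > MAX_BODY then
    return nil, "body exceeds buffer cap"
  end
  local parsed = decode(body)
  if type(parsed) ~= "table" or type(parsed.usage) ~= "table" then
    return nil, "no usage field"   -- never refund without real usage data
  end
  local actual = tonumber(parsed.usage.total_tokens)
  if not actual then
    return nil, "total_tokens missing"
  end
  local unused = reserved_tokens - actual
  if unused > 0 then
    reconcile(unused)              -- refund the over-reserved portion
  end
  return actual
end
```

Note the early return when `usage` is absent: refunding nothing in that case is what the "missing usage field" scenario asserts, since guessing a refund could over-credit the budget.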

Test plan

  • busted spec/unit/streaming_spec.lua — 18 scenarios pass (4 new: F015-NS-1..4)
  • busted spec/unit/ spec/integration/ — 485 total, 0 failures
  • E2E: non-streaming LLM request via reverse_proxy mode shows token refund in metrics after response

🤖 Generated with Claude Code

oai-codex and others added 2 commits on March 12, 2026 at 10:58

…#12)

Without a body_filter phase for non-streaming responses, cost_extractor
was never called and pessimistic TPM/TPD reservations were never refunded,
causing token budgets to drain faster than actual usage.

Changes:
- streaming.lua: import cost_extractor; add _reconcile_non_streaming() that
  buffers up to 1 MiB of response body per spec, extracts usage.total_tokens,
  and calls llm_limiter.reconcile() to refund unused tokens
- streaming.lua: body_filter() now handles the non-streaming path (active=false
  but key present) by buffering chunks and reconciling on EOF
- streaming.lua: init_stream() initialises body_buffer field
- streaming_spec.lua: install mock_cjson_safe; add 4 new scenarios covering
  full-body extraction, chunked delivery, missing usage field (no over-refund),
  and no-context passthrough
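The chunk-buffering branch of `body_filter()` can be sketched as follows. In OpenResty the chunk and EOF flag arrive via `ngx.arg[1]` / `ngx.arg[2]`; here they are plain parameters so the logic is testable outside nginx, and `on_body_chunk`/`on_complete` are illustrative names, not the module's API.

```lua
-- Illustrative sketch of the non-streaming body_filter branch (names assumed).
local function on_body_chunk(ctx, chunk, eof, on_complete)
  if not ctx or ctx.active then
    return  -- no LLM context, or streaming path (handled by the SSE parser)
  end
  ctx.body_buffer = ctx.body_buffer or {}
  if chunk and chunk ~= "" then
    ctx.body_buffer[#ctx.body_buffer + 1] = chunk  -- table append avoids O(n^2) concat
  end
  if eof then
    on_complete(table.concat(ctx.body_buffer))     -- full body: reconcile now
    ctx.body_buffer = nil
  end
end
```

Accumulating chunks in a table and calling `table.concat` once at EOF is the idiomatic Lua pattern for this; repeated string concatenation would copy the buffer on every chunk.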

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add mock LLM backend (nginx serving fixed usage JSON), a new edge service
in reverse_proxy mode with token_bucket_llm policy, and three e2e scenarios:

- pass-through: response body contains usage.total_tokens=50
- metric: fairvisor_token_reservation_unused_total emitted after reconcile
- refund: 5 consecutive requests succeed (reconciled ~50 tokens each vs
  pessimistic 1000, so budget is not exhausted)

Test runs as part of the nightly full e2e suite (pytest tests/e2e).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


…ng e2e

1. bundle_loader: validate_config was called on a shallow copy of
   algorithm_config, so computed fields (_tpm_bucket_config, applied
   defaults, etc.) were never written back to rule.algorithm_config.
   After validation, merge all non-algorithm keys back so request-time
   code sees the normalised config.

2. rule_engine: the final "allow" decision built at line 882 omitted
   rule_name, and limit_result.key was never set for token_bucket_llm.
   decision_api.lua:933 then fell through to a fallback that concatenated
   a nil rule_name, causing a 500. Fix: track last_allow_rule_name, set
   limit_result.key = counter_key for allowed LLM checks, and include
   rule_name in the final allow decision.
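The merge-back in point 1 amounts to a small loop. A minimal sketch, with assumed names (`merge_validated`, and the keys shown in the test): after validate_config() has normalised the shallow copy, every key it added or rewrote is copied onto the rule's live table so later request-time lookups see computed fields like _tpm_bucket_config.

```lua
-- Minimal sketch of the post-validation merge-back (names assumed).
local function merge_validated(algorithm_config, validated_copy)
  for k, v in pairs(validated_copy) do
    algorithm_config[k] = v  -- propagate normalised / computed fields
  end
  return algorithm_config
end
```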

These bugs were latent and exposed by the new e2e test suite added for
issue #12 / Feature 015.  All 3 e2e reconciliation tests now pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@levleontiev levleontiev merged commit 7fe8163 into main Mar 12, 2026
7 checks passed
@levleontiev levleontiev deleted the fix/issue-12-non-streaming-token-reconciliation branch March 12, 2026 12:57