ZBBS-WORK-404: 429 RATE_LIMITED for rate-limited wait-calls #235
Merged
…nstead of 200 reply=null

When a VA is in rate-limit cooldown, handleDirectChat resolved the wait=true promise with null. The salem engine got a 200 with reply=null, which it can only classify as a malformed response. Rate limiting masqueraded as model-output failure in tick telemetry (the misattribution behind reactor-liveness finding #13 / ZBBS-HOME-332).

The rate-limited branch now throws a typed error (statusCode 429, code RATE_LIMITED, resumesInSeconds from the limiter cooldown); the /chat/send wait-mode catch allowlists exactly that shape through as HTTP 429 with error.resumes_in_seconds. Everything else keeps the 502 REPLY_FAILED contract. The breadcrumb chat row still lands before the throw; non-wait dispatch swallows the rejection as before.

Engine counterpart (new ErrorRateLimited class) ships separately in the salem repo; old engine binaries map 429 to malformed, identical to today, so deploy order is safe either way (api first preferred).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
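The throw-and-allowlist flow described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: `RateLimitedError` and `toWaitModeResponse` are hypothetical names, and the response body layout beyond `error.resumes_in_seconds` and the `REPLY_FAILED` code is an assumption.

```javascript
// Hypothetical typed error carrying the three fields named in the commit message.
class RateLimitedError extends Error {
  constructor(resumesInSeconds) {
    super('Rate limited');
    this.name = 'RateLimitedError';
    this.statusCode = 429;
    this.code = 'RATE_LIMITED';
    this.resumesInSeconds = resumesInSeconds;
  }
}

// Hypothetical helper for the wait-mode catch: maps a caught error to
// { status, body }. Only the exact rate-limit shape is let through as 429;
// every other error falls back to the 502 REPLY_FAILED contract.
function toWaitModeResponse(err) {
  const isRateLimited =
    err != null &&
    err.statusCode === 429 &&
    err.code === 'RATE_LIMITED' &&
    Number.isFinite(err.resumesInSeconds);

  if (isRateLimited) {
    return {
      status: 429,
      // Body layout is assumed; the source only names error.resumes_in_seconds.
      body: {
        error: {
          code: 'RATE_LIMITED',
          resumes_in_seconds: err.resumesInSeconds,
        },
      },
    };
  }

  return { status: 502, body: { error: { code: 'REPLY_FAILED' } } };
}
```

Checking all three fields, rather than the status code alone, keeps an unrelated error that happens to carry `statusCode: 429` on the 502 path, which is one plausible reading of "allowlists exactly that shape".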
jeffdafoe added a commit that referenced this pull request Jun 16, 2026
When a VA is in rate-limit cooldown, the /chat/send wait=true path resolved with a 200 and reply: null. The salem engine can only classify that as a malformed response, so cooldown windows masqueraded as model-output failures in tick telemetry (the misattribution behind reactor-liveness finding #13 / ZBBS-HOME-332).

handleDirectChat now throws a typed error (statusCode: 429, code: 'RATE_LIMITED', resumesInSeconds from the limiter cooldown) instead of resolving null. The breadcrumb [Error] Rate limited chat row still lands before the throw.

routes/chat.js allowlists exactly that error shape through as HTTP 429 with error.resumes_in_seconds; everything else keeps the 502 REPLY_FAILED contract (allowlist per code_review round 1, no blockers).

The rateLimitResumeSeconds() helper reads the limiter's _cooldownUntil.

Engine counterpart (new ErrorRateLimited class) ships in the salem repo. Old engine binaries map 429 → malformed, identical to today's label, so the deploy window is safe; deploy this first.

-- Work
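A helper like rateLimitResumeSeconds() might look roughly like the sketch below. Only the helper name and the `_cooldownUntil` field come from the description. The rest is assumed: that `_cooldownUntil` is an epoch-millisecond timestamp, that the helper takes the limiter as an argument, and that it rounds up to whole seconds and never goes negative.

```javascript
// Sketch only. Assumes limiter._cooldownUntil is an epoch-ms timestamp;
// the real limiter may store the cooldown differently.
function rateLimitResumeSeconds(limiter, now = Date.now()) {
  const until = limiter && limiter._cooldownUntil;

  // No cooldown recorded (or an unusable value): nothing to wait for.
  if (!Number.isFinite(until)) return 0;

  // Round up so a caller never retries slightly too early; clamp at zero
  // for cooldowns that have already expired.
  return Math.max(0, Math.ceil((until - now) / 1000));
}
```

The `now` parameter is an illustration-only addition that makes the function deterministic to test; for example, `rateLimitResumeSeconds({ _cooldownUntil: 10500 }, 10000)` yields `1`.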
🤖 Generated with Claude Code