Use encoder sliding_window in encode_step KV cache init by artuskg · Pull Request #4 · awni/voxmlx

artuskg · 2026-02-09T18:13:57Z

Bug fix: use encoder `sliding_window` for streaming encoder KV cache

Problem

`VoxtralRealtime.encode_step()` initialized the encoder KV cache with a large hardcoded size (100000) instead of the model-configured encoder sliding window.

Resolution

Initialize per-layer encoder caches with:
- `RotatingKVCache(int(self.encoder.sliding_window))`
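
A minimal sketch of the per-layer initialization described above. `RotatingKVCache` and `Encoder` here are simplified stand-ins, not the project's classes, and `num_layers=4` is a hypothetical value; the window of 750 is the figure quoted in the related commit message.

```python
from dataclasses import dataclass


@dataclass
class RotatingKVCache:
    """Stand-in for the real cache class; it only records its capacity."""
    max_size: int


@dataclass
class Encoder:
    """Stand-in for the encoder config; field names are assumptions."""
    sliding_window: int
    num_layers: int


def init_encoder_cache(encoder):
    # One cache per encoder layer, each sized to the configured window
    # rather than to a hardcoded constant such as 100000.
    return [
        RotatingKVCache(int(encoder.sliding_window))
        for _ in range(encoder.num_layers)
    ]


encoder = Encoder(sliding_window=750, num_layers=4)
cache = init_encoder_cache(encoder)
```

The `int(...)` cast mirrors the resolution line and guards against a window value that was loaded from config as a float or string.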

Why this helps

Restores the model/runtime contract: the streaming cache size now matches the encoder's configured sliding window.
Prevents avoidable memory and compute growth for long-running streams.
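
To illustrate the growth argument, the sketch below counts how many entries a bounded buffer retains after a long stream. A `collections.deque` stands in for the KV cache, and the step count of 50,000 is an arbitrary assumption; real memory and compute cost also depend on tensor shapes.

```python
from collections import deque


def entries_after(steps, max_size):
    # A deque with maxlen drops its oldest item once full, which is the
    # same retention behaviour a rotating cache is meant to provide.
    window = deque(maxlen=max_size)
    for step in range(steps):
        window.append(step)
    return len(window)


with_configured_window = entries_after(50_000, 750)
with_hardcoded_size = entries_after(50_000, 100_000)
```

With the configured window the buffer stays at 750 entries, while the hardcoded size lets it keep every one of the 50,000 steps seen so far.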

Regression test

tests/test_bugfix_encoder_window_optional.py::test_encode_step_uses_encoder_sliding_window

How to run

VOXMLX_ENABLE_MLX_RUNTIME_TESTS=1 python3 -m unittest -v tests.test_bugfix_encoder_window_optional
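
The command above suggests the test is opt-in via an environment variable. One way such gating could look is sketched below; this is an assumption about the mechanism, not the contents of the actual test file, and `FakeCache`, `FakeEncoder`, and `init_encoder_cache` are hypothetical stand-ins.

```python
import os
import unittest

ENV_FLAG = "VOXMLX_ENABLE_MLX_RUNTIME_TESTS"


class FakeCache:
    """Hypothetical stand-in that records the requested capacity."""

    def __init__(self, max_size):
        self.max_size = max_size


class FakeEncoder:
    """Hypothetical stand-in for the encoder configuration."""
    sliding_window = 750
    num_layers = 2


def init_encoder_cache(encoder):
    return [
        FakeCache(int(encoder.sliding_window))
        for _ in range(encoder.num_layers)
    ]


@unittest.skipUnless(os.environ.get(ENV_FLAG) == "1", f"set {ENV_FLAG}=1 to run")
class EncoderWindowTest(unittest.TestCase):
    def test_cache_uses_encoder_sliding_window(self):
        encoder = FakeEncoder()
        cache = init_encoder_cache(encoder)
        self.assertEqual(len(cache), encoder.num_layers)
        for layer_cache in cache:
            self.assertEqual(layer_cache.max_size, encoder.sliding_window)


def run_suite():
    # Runs the test case programmatically; it is reported as skipped
    # unless the environment flag is set to "1".
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(EncoderWindowTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Gating on an environment variable keeps the default test run independent of the MLX runtime while still allowing the check to be enabled explicitly.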

…egression test

Two bugs caused the encoder to produce garbage embeddings once audio exceeded the sliding window size:

1. Encoder KV cache was initialized with hardcoded 100_000 instead of the actual encoder.sliding_window (750). This meant the cache never rotated, growing unbounded until memory or attention degraded.
2. RotatingKVCache._update_concat trim calculation didn't account for the size of the appended keys, trimming one entry too few.

Fixes awni#7. Based on patches from PRs awni#3 and awni#4.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
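
The trim issue described in this commit message can be illustrated with plain Python lists. This is a simplified sketch, not the project's `_update_concat`; the function names are hypothetical, and the real trimming rule may differ in details such as retained prefix entries.

```python
def trim_ignoring_new(cached_len, new_len, max_size):
    # Only looks at what is already cached, so the entries about to be
    # appended are not counted against the window.
    return max(0, cached_len - max_size)


def trim_counting_new(cached_len, new_len, max_size):
    # Counts the appended entries as well, so the result fits the window.
    return max(0, cached_len + new_len - max_size)


def append_with_trim(cache, new, max_size, trim_fn):
    trim = trim_fn(len(cache), len(new), max_size)
    # Drop the oldest entries first, then append the new ones.
    return cache[trim:] + new


full_cache = list(range(750))  # cache already holds a full window
new_entries = [750]            # one new entry arrives

too_long = append_with_trim(full_cache, new_entries, 750, trim_ignoring_new)
in_window = append_with_trim(full_cache, new_entries, 750, trim_counting_new)
```

For a single appended entry, the first variant leaves 751 items (one entry too few trimmed), while the second keeps the length at 750.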

Use encoder sliding_window in encode_step cache initialization with regression test

14840a1

artuskg mentioned this pull request Feb 9, 2026

Add optional regression tests for 4 confirmed bug behaviors #2

Closed

artuskg wants to merge 1 commit into awni:main from artuskg:codex/awni-fix2-encoder-window
