A Daydream Scope plugin implementing the Context Forcing pipeline for consistent long-context autoregressive video generation.
Based on the paper: Context Forcing: Consistent Autoregressive Video Generation with Long Context (Chen et al., Feb 2026) from TIGER-AI-Lab (University of Waterloo).
Note: Official checkpoints and code have not yet been released by the authors. This plugin implements the Slow-Fast Memory architecture described in the paper on top of Causal Forcing weights as a compatible starting point. It will be updated when official weights become available.
Context Forcing solves the student-teacher mismatch in streaming video generation by training with a long-context teacher that sees the full generation history. The key innovation is a Slow-Fast Memory architecture that replaces the standard FIFO KV cache:
Memory = Sink ∪ SlowMemory ∪ FastMemory
| Component | Size | Role |
|---|---|---|
| Attention Sink | N_s = 3 frames | Initial tokens preserved for attention stability |
| Slow Memory | N_c = 12 frames | Long-term buffer for high-entropy keyframes |
| Fast Memory | N_l = 6 frames | Rolling FIFO queue for immediate local context |
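As a rough sketch of how this tri-partite cache can be organized (the class name `SlowFastMemory`, frame-level granularity, and deque-based eviction are illustrative assumptions, not the plugin's actual API):

```python
# Illustrative sketch of the Sink / Slow / Fast cache layout -- not the plugin's real API.
from dataclasses import dataclass, field
from collections import deque

@dataclass
class SlowFastMemory:
    sink_size: int = 3    # N_s: initial frames kept for attention stability, never evicted
    slow_size: int = 12   # N_c: long-term buffer for high-entropy keyframes
    fast_size: int = 6    # N_l: rolling FIFO for immediate local context
    sink: list = field(default_factory=list)
    slow: deque = field(init=False)
    fast: deque = field(init=False)

    def __post_init__(self):
        self.slow = deque(maxlen=self.slow_size)  # oldest keyframe dropped when the buffer is full
        self.fast = deque(maxlen=self.fast_size)  # plain FIFO eviction

    def context(self) -> list:
        """Frames visible to attention: Sink ∪ SlowMemory ∪ FastMemory."""
        return self.sink + list(self.slow) + list(self.fast)
```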
Tokens are promoted from Fast Memory to Slow Memory only when they represent significant temporal changes:
consolidate(x_t) if cos_sim(k_t, k_{t-1}) < τ (τ = 0.95)
This prioritizes novelty over redundancy, enabling consistent generation beyond 20 seconds (vs ~5s for standard Causal Forcing).
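For illustration only, a minimal version of this check, assuming per-frame key tensors of shape `[tokens, dim]` that are mean-pooled before comparison (the paper may compare keys differently):

```python
import torch
import torch.nn.functional as F

def should_consolidate(k_t: torch.Tensor, k_prev: torch.Tensor, tau: float = 0.95) -> bool:
    """Promote a frame from Fast to Slow Memory only when its keys differ enough
    from the previous frame's keys, i.e. a significant temporal change."""
    # Assumption: mean-pool token keys into one vector per frame before comparing.
    sim = F.cosine_similarity(k_t.mean(dim=0), k_prev.mean(dim=0), dim=0)
    return sim.item() < tau
```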
All tokens are mapped to positions in [0, N_s + N_c + N_l - 1] regardless of actual generation step, preventing out-of-distribution positional embeddings during long generation.
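A minimal sketch of the idea, assuming positions are assigned by slot in the concatenated cache rather than by global generation step (the paper's exact assignment may differ):

```python
def cache_positions(num_cached_frames: int, sink: int = 3, slow: int = 12, fast: int = 6) -> list[int]:
    """Assign positions by slot in the concatenated Sink ∪ Slow ∪ Fast cache,
    so every position lies in [0, sink + slow + fast - 1] ([0, 20] with the
    defaults) and positional embeddings stay in-distribution during long rollouts."""
    max_len = sink + slow + fast
    assert num_cached_frames <= max_len
    # A frame generated at global step 500 may still sit at, say, position 17.
    return list(range(num_cached_frames))
```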
| Aspect | Causal Forcing | Context Forcing |
|---|---|---|
| Context Length | ~5 seconds | >20 seconds |
| KV Cache | Rolling FIFO | Slow-Fast tri-partite |
| Key Innovation | AR teacher for ODE init | Long-context teacher + memory |
```
uv add scope-context-forcing
```

Or install from source:

```
git clone https://github.com/livepeer/scope-context-forcing.git
cd scope-context-forcing
uv sync --group dev
uv pip install -e .
```

| Parameter | Default | Description |
|---|---|---|
| `sink_size` | 3 | Attention Sink size (N_s) |
| `slow_memory_size` | 12 | Slow Memory capacity (N_c) |
| `fast_memory_size` | 6 | Fast Memory capacity (N_l) |
| `consolidation_threshold` | 0.95 | Cosine similarity threshold (τ) |
| `consolidation_interval` | 2 | Consolidation frequency in chunks |
| `height` | 480 | Output height |
| `width` | 832 | Output width |
| `denoising_steps` | [1000, 750, 500, 250] | Denoising schedule |
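For example, these parameters might be gathered into a plain settings dict; the variable name and how it is passed to the pipeline are hypothetical, and only the parameter names and defaults come from the table above:

```python
# Hypothetical configuration sketch; names and defaults follow the parameter table above.
config = {
    "sink_size": 3,                          # N_s
    "slow_memory_size": 12,                  # N_c
    "fast_memory_size": 6,                   # N_l
    "consolidation_threshold": 0.95,         # τ
    "consolidation_interval": 2,             # consolidate every 2 chunks
    "height": 480,
    "width": 832,
    "denoising_steps": [1000, 750, 500, 250],
}
```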
```
uv sync --group dev
```

- Paper: https://arxiv.org/abs/2602.06028
- Project page: https://chenshuo20.github.io/Context_Forcing/
- GitHub: https://github.com/TIGER-AI-Lab/Context-Forcing
- Backbone: Wan2.1-T2V-1.3B
See LICENSE for details.