
feat: Add Recursive Language Model (RLM) tools for large context processing#125

Draft
DaevMithran wants to merge 1 commit into main from large-context
Conversation


DaevMithran commented Apr 9, 2026

Summary

  • Introduces @ekai/rlm, a standalone package with six tools (rlm_overview, rlm_peek, rlm_grep, rlm_slice, rlm_query, rlm_repl) that let an agent explore, search, and reason over large contexts without flooding the context window
  • Integrates RLM into the contexto plugin — enabled via a single rlmEnabled: true config flag, with automatic activation when user messages exceed 50% of the token budget
  • After processing, synthesized results are ingested into the mindmap for future recall
  • Includes document parsing support for PDF, Excel, and text formats via optional peer dependencies

Motivation

When input data exceeds the size of the context window, stuffing everything into the prompt either truncates content, degrades quality, or fails entirely. RLM solves this by keeping the full content in an efficient in-memory buffer and exposing it through purpose-built tools. The agent decides what to look at, when, and how deeply, so token usage stays bounded regardless of input size.

Based on the Recursive Language Model paper, adapted from a monolithic engine into a tools-first architecture that composes with OpenClaw's existing agent loop.

What's included

packages/rlm — Standalone reusable package

  • ContextBuffer — in-memory, line-indexed text buffer that all tools delegate to
  • Six tools — overview (structure), peek (browse), grep (search), slice (extract), query (sub-LLM reasoning), repl (sandboxed JS with full access)
  • REPL sandbox — Node.js vm-based execution environment with security constraints (code validation, timeout, iteration limits)
  • Document parsing — PDF (pdf-parse), Excel (xlsx), text/markdown/CSV/JSON (built-in)
  • CompletionProvider interface — provider-agnostic LLM access for sub-queries
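To make the ContextBuffer concept concrete, here is a minimal sketch of a line-indexed buffer that `slice` and `grep` style tools could delegate to. The class and method names are illustrative assumptions, not the actual `@ekai/rlm` API:

```typescript
// Hypothetical sketch of a line-indexed context buffer.
// The real @ekai/rlm ContextBuffer may differ in API and behavior.
class ContextBuffer {
  private lines: string[];

  constructor(text: string) {
    this.lines = text.split("\n");
  }

  get lineCount(): number {
    return this.lines.length;
  }

  // Extract an inclusive, 1-based line range (a "slice" style tool).
  slice(start: number, end: number): string {
    return this.lines.slice(start - 1, end).join("\n");
  }

  // Return matching line numbers and text (a "grep" style tool).
  grep(pattern: RegExp): Array<{ line: number; text: string }> {
    return this.lines
      .map((text, i) => ({ line: i + 1, text }))
      .filter(({ text }) => pattern.test(text));
  }
}

const buf = new ContextBuffer("alpha\nbeta\ngamma\nbeta again");
console.log(buf.lineCount);           // 4
console.log(buf.slice(2, 3));         // "beta\ngamma"
console.log(buf.grep(/beta/).length); // 2
```

The key property is that tools return small, bounded excerpts while the full text stays in the buffer, outside the agent's context window.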

packages/contexto — Plugin integration

  • Config: single rlmEnabled: true flag — no model selection needed (uses OpenRouter auto-routing via pi-ai)
  • Detection: automatic activation when user message exceeds 50% of token budget, or explicit invocation
  • Lifecycle: prepareSubagentSpawn maps child sessions to pending contexts; onSubagentEnded ingests results into mindmap
  • Tool registration: all six RLM tools registered lazily with OpenClaw when enabled
  • Pi-ai adapter: bridges CompletionProvider to OpenClaw's built-in LLM abstraction
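The 50%-of-budget activation check described above could be sketched as follows. The function and parameter names are assumptions for illustration, not the plugin's actual internals:

```typescript
// Illustrative sketch of the automatic-activation rule:
// RLM kicks in when a user message exceeds half the token budget.
function shouldActivateRlm(
  messageTokens: number,
  tokenBudget: number,
  threshold = 0.5, // assumed default matching the 50% rule
): boolean {
  return messageTokens > tokenBudget * threshold;
}

console.log(shouldActivateRlm(60_000, 100_000)); // true: 60% > 50%
console.log(shouldActivateRlm(30_000, 100_000)); // false: 30% <= 50%
```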

Architecture

Large Input (PDF, Excel, logs, code, text)
        │
  parseDocument() → plain text
        │
  ContextBuffer (in-memory, line-indexed)
        │
  Agent selects tools iteratively:
    overview → grep → peek/slice → query/repl
        │
  Synthesized answer → mindmap (future recall)

Content never enters the agent's context window directly. Token usage stays bounded regardless of input size.
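The "bounded regardless of input size" property can be illustrated with a toy helper: whatever the buffer holds, a tool call returns only a capped excerpt around a match. This is a hypothetical sketch, not code from the PR:

```typescript
// Toy illustration: a grep-then-peek step that returns a bounded
// excerpt (a few context lines, capped in characters) no matter how
// large the underlying text is. Names are hypothetical.
function boundedExcerpt(
  text: string,
  pattern: RegExp,
  contextLines = 1,
  maxChars = 200,
): string {
  const lines = text.split("\n");
  const hit = lines.findIndex((l) => pattern.test(l));
  if (hit === -1) return "";
  const start = Math.max(0, hit - contextLines);
  const end = Math.min(lines.length, hit + contextLines + 1);
  return lines.slice(start, end).join("\n").slice(0, maxChars);
}

console.log(boundedExcerpt("a\nerror here\nb\nc", /error/)); // "a\nerror here\nb"
```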

Test plan

  • tsc --noEmit passes in both packages/rlm and packages/contexto
  • openclaw.plugin.json reflects rlmEnabled boolean config
  • Verify RLM tools activate and produce expected output
  • Verify mindmap ingestion of RLM summaries for future recall

DaevMithran marked this pull request as draft April 9, 2026 14:24
DaevMithran changed the title from "feat: Support large contexts via rlm" to "feat: Add Recursive Language Model (RLM) tools for large context processing" Apr 9, 2026
