Better Context > Bigger Models
ContextFlow is an experimental Context Engine for LLM systems (Pre-1.0).
It provides a stateful orchestration layer to cache, rank, compress, and budget context before injecting it into Large Language Models (LLMs).
Designed for:
- Agents (ReAct, LangGraph, CrewAI)
- RAG systems
- Coding assistants
- Long-running LLM loops
Modern LLM applications often fail not because of the model, but because of bad context: oversized prompts trigger the "Lost in the Middle" phenomenon, where models overlook relevant details buried mid-context.
The Solution: ContextFlow sits between your data sources and your orchestration layer.
It deterministically removes noise, ranks memory by semantic relevance, cryptographically caches static files, and enforces hard tiktoken limits, so you spend fewer tokens, cut time-to-first-token (TTFT) latency, and reduce hallucination loops.
For a deep dive into how ContextFlow works and how to extend its architecture, explore the documentation:
1. Developer Usage & Extension Guide (docs/USAGE.md) - How to write code using ContextFlow, embed it in Agent loops, and write custom compression layers.
2. Architectural Analysis (docs/ANALYSIS.md) - Why Context Engineering matters and our concept evaluation.
3. Project Architecture (docs/ARCHITECTURE.md) - The SOLID pipeline design.
4. Component Pipeline (docs/PIPELINE.md) - How messages flow through the system.
5. Interfaces (docs/INTERFACES.md)
6. Metrics & Telemetry (docs/METRICS.md)
API Data Contracts:
7. Message Schema (docs/MESSAGE_SCHEMA.md) - Pydantic definitions and Priority indexing.
8. Tokenization Limits (docs/TOKENIZATION.md) - Explicit Tiktoken constraints and algorithmic slicing.
9. Filtering Modes (docs/MODES.md)
10. Deterministic Compression (docs/COMPRESSION.md)
11. Async Providers (docs/PROVIDER.md)
| Feature | Description |
|---|---|
| Context Sessions | Stateful wrapper for seamless multi-turn Agent memory |
| Context Caching | Cryptographic hashes skip CPU compression on static RAG items |
| Context Ranking | Dynamic scoring with time decay that prioritizes recent context over old logs |
| Deterministic Compression | Remove boilerplate safely without expensive LLM distillation latency |
| Token Budgeting | Hard tiktoken limits strictly prioritizing semantic retention |
| Provider Agnostic | Standard Async API adaptable for OpenAI, Claude, or local Ollama |
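As a rough illustration of how ranking and budgeting can interact (a stdlib sketch with invented names such as `Item`, `score`, and `budget`, not ContextFlow's actual scoring), an exponential time-decay score down-weights older messages, after which items are admitted greedily until a hard token budget is exhausted:

```python
import math
from dataclasses import dataclass

@dataclass
class Item:
    text: str
    age_s: float       # seconds since the item was produced
    relevance: float   # semantic similarity score in [0, 1]

def score(item: Item, half_life_s: float = 3600.0) -> float:
    # Exponential time decay: an item loses half its weight
    # every `half_life_s` seconds.
    decay = math.exp(-math.log(2) * item.age_s / half_life_s)
    return item.relevance * decay

def budget(items: list[Item], max_tokens: int) -> list[Item]:
    # Greedy selection under a hard token cap. A whitespace split
    # stands in for a real tokenizer such as tiktoken.
    chosen, used = [], 0
    for it in sorted(items, key=score, reverse=True):
        cost = len(it.text.split())
        if used + cost <= max_tokens:
            chosen.append(it)
            used += cost
    return chosen
```

Ranking before budgeting means the hard limit drops the lowest-value context first, which is what makes a strict token cap safe to enforce.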
ContextFlow is designed to support both public open-source workflows (installing from PyPI) and private internal integration (dropping it into your monorepo).
If you are building an external application, simply install the official package:
```shell
pip install contextflow
```

If you are dropping ContextFlow into an existing secure company monorepo:

```shell
git clone https://github.com/studentleaner/ContextFlow.git
cd ContextFlow
pip install -e .
```

Quickstart:

```python
from contextflow.pipeline import ContextPipeline
from contextflow.mode import MinimalMode
from contextflow.compression import StandardCompressor
from contextflow.provider import MockProvider

pipeline = ContextPipeline(
    sources=[],
    mode=MinimalMode(),
    compressor=StandardCompressor(),
    provider=MockProvider(),
)

response = pipeline.run(goal="Summarize the system errors from the logs.")
print(response)
```

MIT License