pymrsf — Model-Relative Semantic Filtering

Score RAG chunks by information gain — not just relevance.

Vector databases and semantic chunkers retrieve by relevance (cosine similarity). A chunk can be highly relevant yet contain only facts the model already memorized during training — wasted context window. pymrsf uses the model's own predictive surprise to detect which chunks contain genuinely new information.
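
To make "predictive surprise" concrete, here is a minimal sketch that averages token surprisal, -log2(p), over a chunk. It is an illustration only: the function name `mean_surprisal` and the hand-written probabilities are assumptions, and pymrsf's actual novelty formula is not stated in this README.

```python
import math


def mean_surprisal(token_probs):
    """Average surprisal in bits: -log2(p) averaged over the tokens of a chunk.

    `token_probs` holds the probability the model assigned to each token that
    actually occurred. Low probabilities mean the model was surprised.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    return sum(-math.log2(p) for p in token_probs) / len(token_probs)


# Hypothetical per-token probabilities, written by hand for illustration.
familiar = [0.5, 0.5, 0.25, 0.5]        # the model predicts this text well
novel = [0.125, 0.0625, 0.25, 0.125]    # the model is frequently wrong

print(mean_surprisal(familiar))  # 1.25
print(mean_surprisal(novel))     # 3.0
```

A chunk the model predicts well scores low and adds little to the context window; a chunk with high average surprisal is more likely to carry information the model lacks.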

Novelty: Does the model already know this? (surprise-based)
Relevance: Is this related to the query? (cosine similarity)
Query Ignorance: Does the model even know the answer? (probe-based gate)
Diversity: Does a better chunk already cover this? (dedup post-filter)
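
The relevance and diversity signals above can be sketched with plain cosine similarity. This is a toy under stated assumptions, not pymrsf's implementation: `cosine`, `dedup`, the 0.9 threshold, and the two-dimensional embeddings are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dedup(scored, threshold=0.9):
    """Greedy diversity post-filter.

    Visits chunks best-first and keeps a chunk only if it is not too similar
    to a chunk that was already kept. `scored` is a list of
    (score, embedding, text) tuples.
    """
    kept = []
    for score, emb, text in sorted(scored, key=lambda item: item[0], reverse=True):
        if all(cosine(emb, kept_emb) < threshold for _, kept_emb, _ in kept):
            kept.append((score, emb, text))
    return [text for _, _, text in kept]


# Hypothetical scores and embeddings, for illustration only.
scored = [
    (80, [1.0, 0.0], "backprop uses the chain rule"),
    (70, [0.99, 0.05], "backprop applies the chain rule"),  # near-duplicate
    (60, [0.0, 1.0], "Rayleigh scattering"),
]
print(dedup(scored))  # the near-duplicate is dropped
```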

Quick install

# Start fast with an API provider (no 4 GB model download)
pip install "pymrsf[openai]"   # quotes keep shells such as zsh from globbing the brackets
export OPENAI_API_KEY='sk-...'

# Or for full features (probing, smart_chunk, round-trip):
pip install "pymrsf[local]"

All providers require Ollama for embeddings:

ollama pull nomic-embed-text

30-second example — score and filter chunks

from pymrsf import score_chunk, filter_chunks

chunks = [
    "Backpropagation computes gradients using the chain rule.",
    "Neural networks are inspired by the human brain.",
    "The sky is blue because of Rayleigh scattering.",
]

# Score a single chunk
result = score_chunk(chunks[0], query="How does backpropagation work?")
print(result["rag_score"])   # 0–100
print(result["verdict"])     # "excellent" / "good" / "moderate" / "weak" / "skip"

# Filter to only the useful chunks
useful = filter_chunks(chunks, query="How does backpropagation work?", min_rag_score=50)
# useful ≈ ["Backpropagation computes gradients..."]

With async for production pipelines:

import asyncio
from pymrsf import filter_chunks_async

useful = asyncio.run(filter_chunks_async(chunks, query="...", min_rag_score=50))
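
For readers unfamiliar with the pattern, the sketch below shows how concurrent scoring of this kind is commonly structured with asyncio: calls fan out with `asyncio.gather` while a semaphore caps the number in flight. Every name in it (`fake_score`, `filter_concurrently`, `max_in_flight`) is hypothetical, and the word-overlap scorer is a stand-in for a real provider call. It does not describe how `filter_chunks_async` works internally.

```python
import asyncio


async def fake_score(chunk: str, query: str) -> int:
    """Hypothetical stand-in for a network-bound scoring call (0 to 100)."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    words = set(query.lower().split())
    if not words:
        return 0
    hits = sum(word in chunk.lower() for word in words)
    return 100 * hits // len(words)


async def filter_concurrently(chunks, query, min_score, max_in_flight=4):
    """Score all chunks concurrently, with at most `max_in_flight` calls at once."""
    sem = asyncio.Semaphore(max_in_flight)

    async def guarded(chunk):
        async with sem:
            return chunk, await fake_score(chunk, query)

    # gather returns results in the same order as its inputs.
    results = await asyncio.gather(*(guarded(c) for c in chunks))
    return [chunk for chunk, score in results if score >= min_score]


chunks = [
    "Backpropagation computes gradients using the chain rule.",
    "Neural networks are inspired by the human brain.",
    "The sky is blue because of Rayleigh scattering.",
]
print(asyncio.run(filter_concurrently(chunks, "backpropagation gradients", 50)))
```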

60-second example — surprise-guided chunking

Instead of splitting at fixed sizes or sentence boundaries, smart_chunk uses the model's surprise signal to find natural knowledge transitions:

from pymrsf import smart_chunk

long_article = """
Quantum computing leverages superposition and entanglement to perform
calculations that would be infeasible for classical computers. Unlike
classical bits, qubits can exist in multiple states simultaneously.
...
Machine learning models learn patterns from data through iterative
optimization of a loss function. Neural networks, in particular,
use backpropagation to adjust millions of parameters.
...
"""

# Chunks split at the boundary between "quantum computing" and "ML" —
# where the model's surprise signal drops after absorbing one topic
pieces = smart_chunk(long_article, min_chunk_len=200, max_chunk_len=800)

Surprise-guided splitting requires the local provider. With an API provider, smart_chunk falls back to sentence splitting.
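
The idea of splitting at knowledge transitions can be sketched as follows, assuming a per-sentence surprise score is already available. This toy cuts before a sentence whose surprise spikes above mean + k standard deviations, subject to the length limits. The function name, the spike rule, and the numbers are assumptions for illustration, not smart_chunk's actual algorithm.

```python
from statistics import mean, pstdev


def split_on_surprise(sentences, surprise, min_chunk_len=200, max_chunk_len=800, k=1.0):
    """Group sentences into chunks, cutting where the surprise score spikes.

    A cut is made before a sentence when its surprise exceeds
    mean + k * stdev and the current chunk already has min_chunk_len
    characters, or when appending would exceed max_chunk_len.
    """
    if len(sentences) != len(surprise):
        raise ValueError("need one surprise value per sentence")
    if not sentences:
        return []
    threshold = mean(surprise) + k * pstdev(surprise)
    chunks, current = [], ""
    for sentence, score in zip(sentences, surprise):
        candidate = (current + " " + sentence).strip()
        spike = score > threshold and len(current) >= min_chunk_len
        too_long = bool(current) and len(candidate) > max_chunk_len
        if spike or too_long:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Hypothetical sentences and surprise scores, for illustration only.
sentences = [
    "Qubits use superposition.",
    "Entanglement links qubits.",
    "Neural networks use backpropagation.",  # topic change: surprise spikes
    "Loss functions guide training.",
]
surprise = [2.0, 1.0, 6.0, 1.5]
for piece in split_on_surprise(sentences, surprise, min_chunk_len=20, max_chunk_len=200):
    print(piece)
```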

Provider matrix

This is the most important table in this README — it tells you which features work with which provider.

Feature	local	openai	anthropic
RAG scoring	Full (novelty + relevance + ignorance)	Relevance-only	Relevance-only
Knowledge probing	✅ Full	⚠️ Limited	❌
smart_chunk (surprise-guided)	✅ Yes	Fallback to sentence	Fallback to sentence
Delta compression / round-trip	✅ Yes	❌	❌
Model session (KV-cache)	✅ Yes	❌	❌
Async scoring	✅	✅	✅
Score caching	✅	✅	✅

Key takeaway: full knowledge probing, surprise-guided smart_chunk, and the experimental round-trip storage all require the local provider (pip install pymrsf[local] plus a GGUF model). If you only need relevance-based RAG scoring, OpenAI or Anthropic work fine.
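
If you want to guard against calling a feature on the wrong provider, the matrix above can be encoded as a small lookup. This is a sketch only: the feature keys and the `require` helper are hypothetical names, and the supported programmatic checks are the ones documented in PROVIDER_SUPPORT.md.

```python
# Features each provider supports fully, transcribed from the provider matrix.
# The key names are hypothetical, chosen for this sketch.
FULL_SUPPORT = {
    "local": {
        "full_rag_scoring", "knowledge_probing", "surprise_chunking",
        "round_trip", "model_session", "async_scoring", "score_caching",
    },
    "openai": {"async_scoring", "score_caching"},
    "anthropic": {"async_scoring", "score_caching"},
}


def require(provider, feature):
    """Raise early if `provider` does not fully support `feature`."""
    if provider not in FULL_SUPPORT:
        raise ValueError(f"unknown provider: {provider!r}")
    if feature not in FULL_SUPPORT[provider]:
        raise RuntimeError(f"{feature!r} needs the local provider, got {provider!r}")


require("local", "round_trip")  # passes silently
try:
    require("openai", "round_trip")
except RuntimeError as exc:
    print(exc)
```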

Production configuration

import pymrsf

# Enable pymrsf log output (silent by default)
pymrsf.configure_logging("INFO")

# Tweak runtime settings without touching env vars
pymrsf.configure(
    provider="openai",
    embed_timeout=60,
    default_relevance_cutoff=0.4,
)

Environment variables for container/CI environments:

PYMRSF_PROVIDER=openai
OPENAI_API_KEY=sk-...
PYMRSF_ALLOW_PROVIDER_FALLBACK=true   # log a warning and continue on embed failures
PYMRSF_EMBED_TIMEOUT=30

PYMRSF_ALLOW_PROVIDER_FALLBACK — when true, embed failures log a warning and continue instead of raising. Off by default (fail-fast).
pymrsf.configure_logging("WARNING") — pymrsf ships with a NullHandler so import pymrsf is silent until you opt in.

See ENV_CONFIG.md for all supported variables.
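
The behaviour described above (fail-fast by default, warn and continue when fallback is allowed, silent logging until you opt in) can be sketched with the standard library alone. The helper names, the accepted boolean spellings, and the defaults for provider and timeout are assumptions for this sketch. ENV_CONFIG.md is the authority on what pymrsf actually reads.

```python
import logging
import os

# Library-style logger: a NullHandler keeps it silent until the application
# configures logging itself.
log = logging.getLogger("pymrsf_sketch")
log.addHandler(logging.NullHandler())


def env_bool(environ, name, default=False):
    """Parse a boolean flag; the accepted spellings are an assumption."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(environ=os.environ):
    """Read the variables listed above. Provider and timeout defaults are assumed."""
    return {
        "provider": environ.get("PYMRSF_PROVIDER", "local"),
        "allow_fallback": env_bool(environ, "PYMRSF_ALLOW_PROVIDER_FALLBACK"),
        "embed_timeout": float(environ.get("PYMRSF_EMBED_TIMEOUT", "30")),
    }


def embed_with_policy(embed, text, allow_fallback):
    """Fail fast by default; with fallback allowed, warn and return None."""
    try:
        return embed(text)
    except Exception as exc:
        if not allow_fallback:
            raise
        log.warning("embedding failed, continuing without it: %s", exc)
        return None


settings = load_settings({"PYMRSF_PROVIDER": "openai", "PYMRSF_EMBED_TIMEOUT": "60"})
print(settings)
```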

Experimental: MRSF delta-compression storage

The round-trip storage backend stores only the "surprise" tokens that the model fails to predict (roughly 40–60% of tokens are predictable and dropped) and reconstructs the text via O(n) model inference. It lives under pymrsf.experimental to signal that it is research-grade:

from pymrsf.experimental import mrsf_write, mrsf_read, save_index

doc = mrsf_write("The Eiffel Tower was built in 1889.")
print(doc["compression"])   # 0.47 — 47% of tokens were predictable

save_index()
results = mrsf_read("famous French landmark", top_k=1)

Full experimental docs →
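
The principle behind storing only surprise tokens can be shown with a toy predictor. In the sketch below a bigram lookup table stands in for the language model. `make_bigram_predictor`, `encode`, and `decode` are hypothetical names, and nothing here reflects the real storage format used by mrsf_write and mrsf_read.

```python
def make_bigram_predictor(corpus_tokens):
    """Toy 'model': predicts the next token from the previous one."""
    table = {}
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        table.setdefault(prev, nxt)  # first-seen continuation wins

    def predict(prefix):
        return table.get(prefix[-1]) if prefix else None

    return predict


def encode(tokens, predict):
    """Keep only the tokens the predictor gets wrong, keyed by position."""
    surprises = {
        i: tok for i, tok in enumerate(tokens) if predict(tokens[:i]) != tok
    }
    return {"length": len(tokens), "surprises": surprises}


def decode(doc, predict):
    """Rebuild the text with one predictor call per missing token."""
    out = []
    for i in range(doc["length"]):
        if i in doc["surprises"]:
            out.append(doc["surprises"][i])
        else:
            out.append(predict(out))
    return out


# What the toy model "knows", and a sentence that differs in one fact.
predict = make_bigram_predictor("the eiffel tower was built in paris".split())
tokens = "the eiffel tower was built in 1889".split()

doc = encode(tokens, predict)
print(doc["surprises"])                 # only the unpredictable tokens are stored
print(decode(doc, predict) == tokens)   # True: the round trip is lossless
```

Decoding is lossless only when the same deterministic predictor is used for both writing and reading, which is one reason this kind of storage is tied to a specific local model.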

Score interpretation

Score	Verdict	Suggested action
80–100	excellent	Prioritise
60–79	good	Include
40–59	moderate	Include if space allows
20–39	weak	Skip if better chunks exist
0–19	skip	Model already knows this
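
For code that branches on the score rather than the verdict string, the bands in the table can be written as a small helper. The function name is hypothetical, and `score_chunk` already returns a verdict, so this is only a convenience sketch of the same thresholds.

```python
def verdict_for(rag_score):
    """Map a 0 to 100 RAG score to the verdict bands in the table above."""
    if not 0 <= rag_score <= 100:
        raise ValueError("rag_score must be between 0 and 100")
    if rag_score >= 80:
        return "excellent"
    if rag_score >= 60:
        return "good"
    if rag_score >= 40:
        return "moderate"
    if rag_score >= 20:
        return "weak"
    return "skip"


print(verdict_for(85))  # excellent
print(verdict_for(45))  # moderate
print(verdict_for(10))  # skip
```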

Additional documentation

PROVIDER_SUPPORT.md — full capability matrix with programmatic checks
ENV_CONFIG.md — all environment variables
docs/CONCURRENCY.md — threading and process-safety model
CHANGELOG.md — version history

Paper

The technical approach is described in the MRSF paper (link forthcoming). For now, see CHANGELOG.md for the research lineage and the experimental module for the delta-compression implementation.

License

MIT
