feat(smc): add Structured Memory Compression engine (Phase 1) #3

pythondatascrape wants to merge 21 commits into main
Conversation
Covers three optimizations: periodic codebook re-injection (--window flag), system prompt token accounting in proxy, and enum default omission in SerializeTurn. Includes testing plan and edge cases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds background, per-task ownership and dependencies, code examples, acceptance criteria, detailed testing matrix, regression checks, edge case table with rationale, and rollout notes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
8 tasks with TDD steps: enum default parsing, SerializeTurn omission, Definition() markers, proxy token accounting fix, handler windowSize wiring, periodic re-injection gate, test updates, and integration verification. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Parse enum defaults from schema (first value is default) - Omit default enum values in SerializeTurn output - Mark defaults with * in Definition() for LLM decoding - Update tests for new compression behavior Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
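The omission scheme described above can be sketched with hypothetical helpers (the real SerializeTurn operates on the codebook's own types, and the actual output format may differ):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// serializeEnums renders key=value pairs for a compressed turn, omitting any
// field whose value equals its schema default. The decoder treats absence as
// the default, so no information is lost.
func serializeEnums(fields, defaults map[string]string) string {
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output order
	var parts []string
	for _, k := range keys {
		if v := fields[k]; v != defaults[k] {
			parts = append(parts, k+"="+v)
		}
	}
	return strings.Join(parts, ",")
}

func main() {
	// "status" matches its default and is omitted; only "mode" is emitted.
	fmt.Println(serializeEnums(
		map[string]string{"status": "ok", "mode": "fast"},
		map[string]string{"status": "ok", "mode": "slow"},
	))
}
```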
ctxOrig and ctxComp now include len(req.System)/4, fixing the statusline context chart which showed 1:1 because the system prompt (largest payload component) was excluded from estimates. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codebook definitions are now only injected on turn 0 and every windowSize turns, matching the proxy's compression window. This avoids redundant definitions in every prompt while ensuring the LLM can still decode compressed history after old turns are rolled into [CONTEXT_SUMMARY]. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
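A minimal sketch of that injection gate (hypothetical helper name; the actual handler wiring differs):

```go
package main

import "fmt"

// shouldInjectDefs reports whether codebook definitions belong in the prompt
// for a given turn: always on turn 0, then every windowSize turns. A window
// below 2 means "no optimization, inject every turn".
func shouldInjectDefs(turn, windowSize int) bool {
	if windowSize < 2 {
		return true
	}
	return turn == 0 || turn%windowSize == 0
}

func main() {
	for _, turn := range []int{0, 1, 3, 4, 8} {
		fmt.Printf("turn=%d inject=%v\n", turn, shouldInjectDefs(turn, 4))
	}
}
```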
The benchmark was using &testing.T{} to call newTestDeps, which would
panic instead of cleanly failing if setup errors occurred. Inline the
setup using the benchmark's own *testing.B for proper error reporting.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Window size 0 or 1 now returns messages unchanged in Compress(), matching the spec intent of "no optimization, inject every turn." Previously 0 summarized everything and 1 kept only the last message, dropping recent context. Also adds engram serve --window N to override the config file value at the CLI, forwarded through daemonize to the child process. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ession Previously --window defaulted to 0, making it impossible to distinguish "not set" from "explicitly disable compression." Now defaults to -1 (not set); --window 0 correctly sets windowSize=0 which the compressor treats as no-op. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
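The sentinel logic can be sketched as follows, assuming a hypothetical resolveWindow helper (the real flag wiring lives in the serve command and may differ):

```go
package main

import (
	"flag"
	"fmt"
)

// resolveWindow merges the CLI override with the config-file value.
// -1 is the "not set" sentinel, so an explicit --window 0 (disable
// compression) is distinguishable from omitting the flag entirely.
func resolveWindow(cliWindow, configWindow int) int {
	if cliWindow >= 0 {
		return cliWindow
	}
	return configWindow
}

func main() {
	window := flag.Int("window", -1, "compression window (-1 = use config value)")
	flag.Parse()
	fmt.Println("effective window:", resolveWindow(*window, 10))
}
```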
…ions Two bugs caused the statusline context column to show wrong values: 1. Savings percentage used K-rounded integers, so any sub-1K compressed value (e.g. 444 → 0K) displayed as 100% savings instead of ~78%. Fixed by computing percentage from raw values. 2. System prompt field was typed as string, but Claude Code sends it as a content-block array. Go silently unmarshaled to empty string, causing systemTokens=0 and wrong session fingerprint (SHA-256 of ""). Fixed with json.RawMessage + extractSystem() handling both formats. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Full roadmap design for Structured Matrix Compression based on McKinsey and IBM architecture documents. Four phases: core SMC engine, persistence/learning, edge/security, and federation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
10-task TDD plan for core structured matrix compression engine. Covers category schema, k-parameter, matrix, decomposer, config integration, and proxy handler wiring. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Introduces a Phase 1 Structured Matrix Compression (SMC) engine and wires it into the proxy/serve path, alongside codebook/token-accounting optimizations to reduce context overhead and improve reporting accuracy.
Changes:
- Added internal/smc package (schema, k-controller, conversation matrix, rule-based decomposer, preference signal) with unit + e2e tests.
- Integrated SMC into the proxy (optional compression path), improved system-prompt parsing/token accounting, and added SMC wiring via engram serve flags/config.
- Optimized codebook serialization by omitting default enum values and marking defaults in definitions; adjusted related tests and statusline percent computation.
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| internal/smc/category.go | Adds configurable category schema, defaults, and validation. |
| internal/smc/category_test.go | Tests schema defaults/validation. |
| internal/smc/k.go | Adds k-controller for global/per-category k. |
| internal/smc/k_test.go | Tests k-controller behaviors. |
| internal/smc/matrix.go | Implements conversation matrix and serialization to provider messages. |
| internal/smc/matrix_test.go | Tests matrix append/serialization/token counting. |
| internal/smc/decompose.go | Implements Phase 1 rule-based decomposer. |
| internal/smc/decompose_test.go | Tests rule-based decomposition and k impact. |
| internal/smc/preference.go | Adds preference/correction signal type. |
| internal/smc/integration_test.go | Adds end-to-end SMC tests (multi-turn + custom schema). |
| internal/proxy/handler.go | Adds SMC mode, system prompt extraction, and token accounting updates. |
| internal/proxy/handler_test.go | Adds tests for system prompt token accounting, array-form system parsing, and SMC path. |
| internal/proxy/compressor.go | Treats window sizes <2 as "no compression". |
| internal/proxy/proxy.go | Adds server helper to enable SMC on the underlying handler. |
| cmd/engram/serve.go | Adds --window and --k, wires SMC config/enablement into proxy startup. |
| internal/server/handler.go | Adds windowSize and periodic codebook-def injection gating. |
| internal/server/handler_test.go | Updates constructors and adds window-based reinjection tests. |
| internal/server/handler_bench_test.go | Updates benchmark to new handler constructor signature. |
| internal/config/config.go | Adds SMC config under proxy config with defaults. |
| internal/config/config_test.go | Tests SMC config defaults and custom category loading. |
| internal/context/codebook.go | Adds enum defaults, omits default enum fields in serialization, marks defaults in definitions. |
| internal/context/codebook_test.go | Updates/extends tests for new default omission/definition behavior. |
| internal/context/response_codebook_test.go | Updates expectations to reflect default omission behavior. |
| internal/context/history_test.go | Updates history test expectation after default omission changes. |
| internal/optimizer/format.go | Uses shared percent calculation for context saved %. |
| internal/optimizer/format_test.go | Adds test ensuring percent uses raw context values. |
| docs/superpowers/specs/* | Adds design specs for SMC + codebook optimizations. |
| docs/superpowers/plans/* | Adds implementation plans for SMC Phase 1 + codebook optimizations. |
```go
// Decompose all completed exchanges (pairs) except the last pair.
alreadyDecomposed := matrix.Len()
pairs := len(messages) / 2
currentTail := messages[pairs*2:]

for i := alreadyDecomposed; i < pairs; i++ {
	userIdx := i * 2
	assistIdx := i*2 + 1
	if assistIdx >= len(messages) {
		break
	}
	exchange := smc.Exchange{
		UserMessage:      messageText(messages[userIdx]),
		AssistantMessage: messageText(messages[assistIdx]),
		TurnIndex:        i,
	}
	row, err := h.smcDecomposer.Decompose(context.Background(), exchange, h.smcSchema)
	if err != nil {
		slog.Warn("smc: decomposition failed, keeping raw", "turn", i, "err", err)
		continue
	}
	matrix.Append(*row)
}

// Build output: matrix history messages + current raw tail
matrixMsgs := matrix.Messages()
result := make([]AnthropicMessage, 0, len(matrixMsgs)+len(currentTail)+2)

for _, m := range matrixMsgs {
	result = append(result, AnthropicMessage{Role: m.Role, Content: m.Content})
}
if len(result) > 0 {
	result = append(result, AnthropicMessage{Role: "assistant", Content: "[compressed history above]"})
}

// Append the current raw tail
if pairs > 0 && alreadyDecomposed < pairs {
	lastPairStart := (pairs - 1) * 2
	result = append(result, messages[lastPairStart:]...)
} else {
```
In smcCompress, the loop decomposes all complete user+assistant pairs (i < pairs), but later the function also appends the last pair raw (messages[lastPairStart:]). This duplicates the last completed exchange (once in the matrix, once raw) on the first call, and makes the “except the last pair” comment inaccurate. Consider decomposing only up to pairs-1 (or otherwise excluding the last complete pair) when you intend to keep that pair raw for recency.
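One way to express the suggested fix, as a hypothetical standalone helper (the real code would inline this bound in smcCompress):

```go
package main

import "fmt"

// decomposeRange returns the half-open range [from, to) of pair indices to
// decompose, excluding the last complete pair so it can be re-sent raw for
// recency without also appearing in the matrix.
func decomposeRange(alreadyDecomposed, pairs int) (from, to int) {
	to = pairs - 1 // keep the final complete pair out of the matrix
	if to < alreadyDecomposed {
		to = alreadyDecomposed // nothing new to decompose
	}
	return alreadyDecomposed, to
}

func main() {
	from, to := decomposeRange(0, 5)
	fmt.Println(from, to) // decompose pairs 0..3, keep pair 4 raw
}
```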
```go
matrix := h.getOrCreateMatrix(sessionID)

// Decompose all completed exchanges (pairs) except the last pair.
alreadyDecomposed := matrix.Len()
pairs := len(messages) / 2
currentTail := messages[pairs*2:]

for i := alreadyDecomposed; i < pairs; i++ {
	userIdx := i * 2
	assistIdx := i*2 + 1
	if assistIdx >= len(messages) {
		break
	}
	exchange := smc.Exchange{
		UserMessage:      messageText(messages[userIdx]),
		AssistantMessage: messageText(messages[assistIdx]),
		TurnIndex:        i,
	}
	row, err := h.smcDecomposer.Decompose(context.Background(), exchange, h.smcSchema)
	if err != nil {
		slog.Warn("smc: decomposition failed, keeping raw", "turn", i, "err", err)
		continue
	}
	matrix.Append(*row)
}
```
smcCompress mutates the per-session ConversationMatrix (matrix.Append) without any synchronization. Since ServeHTTP can be called concurrently for the same session ID, this can race on the matrix’s underlying slice and corrupt history. Consider guarding per-session matrix mutation with a mutex (per matrix or held under smcMu) or otherwise ensuring requests for a session are serialized.
```go
		AssistantMessage: messageText(messages[assistIdx]),
		TurnIndex:        i,
	}
	row, err := h.smcDecomposer.Decompose(context.Background(), exchange, h.smcSchema)
```
smcCompress calls Decompose with context.Background(), so request cancellation/timeouts from the inbound HTTP request won’t propagate. Passing r.Context() through (e.g., add a ctx parameter to smcCompress) will make future decomposers (LLM-based / slower) behave correctly under client disconnects.
Suggested change:

```diff
- row, err := h.smcDecomposer.Decompose(context.Background(), exchange, h.smcSchema)
+ row, err := h.smcDecomposer.Decompose(ctx, exchange, h.smcSchema)
```
```go
// SMC fields — when smcEnabled is true, use matrix decomposition instead of
// windowed compression.
smcEnabled    bool
smcSchema     smc.CategorySchema
smcK          smc.KController
smcDecomposer smc.Decomposer
smcMatrices   map[string]*smc.ConversationMatrix // sessionID -> matrix
smcMu         sync.Mutex
```
smcMatrices is an unbounded in-memory map keyed by sessionID with no eviction/cleanup path. In long-running proxy processes (or if session IDs are derived from fingerprints), this can grow without bound and increase memory usage over time. Consider adding a cleanup strategy (TTL/LRU) or tying lifecycle to the existing session stats/retention behavior.
```go
// EnableSMC activates structured matrix compression, replacing windowed compression.
func (h *Handler) EnableSMC(schema smc.CategorySchema, k smc.KController) {
	h.smcEnabled = true
	h.smcSchema = schema
	h.smcK = k
	h.smcDecomposer = smc.NewRuleDecomposer(k)
	h.smcMatrices = make(map[string]*smc.ConversationMatrix)
}
```
EnableSMC mutates multiple Handler fields (smcEnabled/schema/k/decomposer/map) without any locking. If this is ever called after the proxy has started serving, it can race with ServeHTTP and/or drop existing matrices. Consider either (1) making EnableSMC safe for concurrent use (mutex/atomic + preserve existing matrices), or (2) documenting/enforcing that it must be called only during initialization before Start().
```go
System    string             `json:"system"`
Stream    bool               `json:"stream"`
Messages  []AnthropicMessage `json:"messages"`
RawSystem json.RawMessage    `json:"system"`
```
anthropicRequest.RawSystem is a json.RawMessage without omitempty. If the incoming request omits system, RawSystem will remain nil and re-marshaling will emit "system": null, changing the wire payload. Consider adding omitempty (and/or only re-marshaling when you actually modify the request) to preserve the original shape.
Suggested change:

```diff
- RawSystem json.RawMessage `json:"system"`
+ RawSystem json.RawMessage `json:"system,omitempty"`
```
```go
if kOverride >= 0 {
	smcCfg.K = kOverride
}
smcSchema := configToSMCSchema(smcCfg)
```
SMC schema values are converted from config and used without validation. Since CategorySchema.Validate exists, it would be safer to validate the schema (non-empty categories, no duplicates, no blank names) before enabling SMC and fail fast with a clear error if the config is invalid.
Suggested change:

```diff
  smcSchema := configToSMCSchema(smcCfg)
+ if err := smcSchema.Validate(); err != nil {
+ 	return fmt.Errorf("invalid proxy SMC schema: %w", err)
+ }
```
```go
func TestHandler_SMCCompression(t *testing.T) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(200)
		json.NewEncoder(w).Encode(map[string]any{
			"content": []map[string]string{{"text": "ok", "type": "text"}},
			"role":    "assistant",
		})
	}))
	defer upstream.Close()

	schema := smc.DefaultSchema()
	kc := smc.NewKController(0.5, schema)
	h := NewHandler(10, t.TempDir(), upstream.URL)
	h.EnableSMC(schema, kc)

	messages := make([]AnthropicMessage, 20)
	for i := range messages {
		if i%2 == 0 {
			messages[i] = AnthropicMessage{Role: "user", Content: fmt.Sprintf("Please update file%d.go to add logging", i)}
		} else {
			messages[i] = AnthropicMessage{Role: "assistant", Content: fmt.Sprintf("Updated file%d.go with slog calls", i)}
		}
	}

	body, _ := json.Marshal(map[string]any{
		"messages": messages,
		"system":   "You are a helpful assistant.",
	})

	req := httptest.NewRequest(http.MethodPost, "/v1/messages", strings.NewReader(string(body)))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("x-api-key", "test-key")

	rr := httptest.NewRecorder()
	h.ServeHTTP(rr, req)

	if rr.Code != 200 {
		t.Fatalf("expected 200, got %d: %s", rr.Code, rr.Body.String())
	}
}
```
TestHandler_SMCCompression only asserts the handler returns 200, but doesn’t assert that SMC compression actually changed the outbound messages payload (e.g., presence of intent= matrix rows and absence of [CONTEXT_SUMMARY], or that the message count shrinks/grows as expected). Since the upstream server can capture the request body, consider asserting on the received JSON to ensure the SMC path is really exercised.
```go
// CompressionRatio returns the target compression ratio for a category.
// Formula: compression_ratio = 1 - (1 - minRatio) * k
// At k=0: returns 1.0 (maximum compression).
// At k=1: returns minRatio (minimum compression, maximum fidelity).
func (kc KController) CompressionRatio(category string, minRatio float64) float64 {
	k := kc.EffectiveK(category)
	return 1 - (1-minRatio)*k
```
KController.CompressionRatio’s formula and doc appear inverted relative to the rest of SMC: RuleDecomposer.compress treats higher k as less compression (keeps more text), but CompressionRatio currently decreases as k increases (k=0→1.0, k=1→minRatio) while the comments call k=0 “maximum compression”. Consider aligning this method with the compress() mapping (e.g., ratio = minRatio + (1-minRatio)*k) or renaming/re-documenting it so k has a consistent meaning across the package.
Suggested change:

```diff
- // CompressionRatio returns the target compression ratio for a category.
- // Formula: compression_ratio = 1 - (1 - minRatio) * k
- // At k=0: returns 1.0 (maximum compression).
- // At k=1: returns minRatio (minimum compression, maximum fidelity).
- func (kc KController) CompressionRatio(category string, minRatio float64) float64 {
- 	k := kc.EffectiveK(category)
- 	return 1 - (1-minRatio)*k
+ // CompressionRatio returns the target retained-text ratio for a category.
+ // Formula: compression_ratio = minRatio + (1 - minRatio) * k
+ // At k=0: returns minRatio (maximum compression).
+ // At k=1: returns 1.0 (minimum compression, maximum fidelity).
+ func (kc KController) CompressionRatio(category string, minRatio float64) float64 {
+ 	k := kc.EffectiveK(category)
+ 	return minRatio + (1-minRatio)*k
```
```go
targetRatio := 0.1 + 0.9*k
targetLen := int(float64(len([]rune(text))) * targetRatio)
if targetLen < 1 {
	targetLen = 1
}

runes := []rune(text)
if len(runes) <= targetLen {
	return text
}
return string(runes[:targetLen])
```
RuleDecomposer.compress converts the input to []rune twice (len([]rune(text)) and then runes := []rune(text)), which doubles allocations for non-empty text. Consider converting once and reusing the slice for both length calculation and truncation.
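The single-conversion version might look like this (a sketch, not the package's actual method):

```go
package main

import "fmt"

// compress truncates text to a k-scaled ratio, converting to []rune exactly
// once so the length calculation and the slice share one allocation.
func compress(text string, k float64) string {
	runes := []rune(text)
	targetLen := int(float64(len(runes)) * (0.1 + 0.9*k))
	if targetLen < 1 {
		targetLen = 1
	}
	if len(runes) <= targetLen {
		return text
	}
	return string(runes[:targetLen])
}

func main() {
	fmt.Println(compress("hello world", 0.0))
}
```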
Summary
- New internal/smc package with core SMC engine: category schema, k-parameter controller, conversation matrix, rule-based decomposer, and preference signal types
- SMC settings wired through config.yaml, plus a new --k CLI flag for engram serve

Details

New types (internal/smc/):
- CategorySchema: defines named content categories (e.g., facts, preferences, instructions) with per-category k-parameter overrides
- KController: manages compression ratio (k) globally and per-category
- ConversationMatrix: stores decomposed conversation rows and serializes back to provider.Message format
- RuleDecomposer: heuristic keyword-based decomposer that extracts category-relevant content from messages
- PreferenceSignal: correction tracking type for future adaptive compression

Integration:
- proxy.Handler.EnableSMC() activates matrix-based compression; falls back to sliding-window when SMC is disabled
- SMCConfig added to ProxyConfig with sensible defaults (k=0.5, default 3-category schema)

16 files changed, 1060 insertions(+), 3 deletions(-)
Test plan
- Unit tests for the new/changed packages (internal/smc/, internal/config/, internal/proxy/)
- Full suite passes under the race detector (go test -race)
- go vet clean

🤖 Generated with Claude Code