feat(engine): traces search API — cog_search_traces + GET /v1/traces #17
Merged
chazmaniandinkle merged 2 commits into cogos-dev:main on Apr 21, 2026
… + HTTP

Closes Agent F's gap #7 — reframed per Agent Q's Survey Q design (2026-04-21) as trace observability rather than diagnostic logs. The content in `.cog/run/*.jsonl` is client-originated semantic metabolite material (turn_metrics, attention signals, TRM prediction-vs-reality, router traces), not kernel slog text. Agent Q's naming reflects that: traces, not logs.

Additive only:
- New `QueryTraces(root, TraceQuery) (*TraceQueryResult, error)` in internal/engine/traces_query.go. Per-source scan + normalize, filter, merge with order-by-timestamp desc/asc, limit + truncated flag. Mirrors Agent L's QueryLedger algorithmic shape.
- New MCP tool `cog_search_traces` registered in registerTools(). Handler delegates to QueryTraces; shares semantics with the HTTP surface via buildTraceQueryFromInput.
- New HTTP route `GET /v1/traces` + handleTraces. `/v1/proprioceptive` stays byte-for-byte identical — web/dashboard.html:1265 and web/canvas.html:1706 still consume `{entries, light_cone}` unchanged.

Unified result shape: `{source, timestamp, session_id?, level?, line}` hides per-source schema drift. The normalization table owns translation:
- turn_metrics → timestamp + session_id
- attention → occurred_at (no session)
- proprioceptive → timestamp + event-as-level
- internal_requests → timestamp (float unix seconds — drifts from spec's 'observed RFC3339'; parseTimestamp accepts both forms)

Filters: source, level, session_id, case-insensitive substring (byte check before JSON parse), since/until (RFC3339 or Go duration), limit (default 100, max 1000), order (desc|asc).

Diagnostics: sources_checked[] reports per-file {scanned, matched, file_exists}. Missing file is not an error; callers can distinguish "file absent" from "empty match".

Out of scope (follow-ups):
- Kernel slog stderr (cli.go:117). Couples to Windows service-manager stderr capture (Agent K territory) — gets its own PR.
- Live tail of trace files. Belongs on Agent N's ledger-backed event bus, not in a file-watcher bolted to this surface.
- Rotation semantics for turn_metrics.jsonl when it crosses 100 MB. Writer-side concern; spec is stable across that boundary.

Tests (18 total):
- 14 unit tests on QueryTraces: empty workspace, single-source, merged multi-source, session_id filter (incl. graceful 0 from no-session source), since duration + RFC3339, until bound, substring (case-insensitive), level on proprioceptive, truncation, malformed-line skipped, order=asc, bad source rejected, normalization correctness across turn_metrics/attention/internal_requests (including float unix drift), substring-too-long rejected.
- HTTP roundtrip + 400-on-unknown-source.
- MCP roundtrip with session_id filter.
- Regression guard: /v1/proprioceptive top-level shape stays exactly {entries, light_cone} — guards dashboard.html:1265 and canvas.html:1706 from silent breakage.
- ParseTraceDurationOrTime: RFC3339, duration, rejection of garbage.
# Conflicts:
#	internal/engine/mcp_server.go
#	internal/engine/serve.go
Summary
Unified traces search surface over the kernel's `.cog/run/*.jsonl` streams — closes Agent F's #7 critical MCP blindspot, reframed per Agent Q's design from "logs with search" to trace observability. (The `.cog/run/` streams are client-originated semantic metabolites per the kernel's metabolic-cycle framing, not diagnostic text.)

What landed
- `cog_search_traces` MCP tool — unified query over multiple JSONL sources with filters for source, session_id, since/until (duration or RFC3339), level, substring, limit, order.
- `GET /v1/traces` HTTP route — 1:1 with the MCP input via query params; returns `{results, count, truncated}`.
- … `attention.jsonl`, `turn_metrics.jsonl`, `internal-requests.jsonl`, `proprioceptive.jsonl` (where present) into unified `{source, timestamp, session_id?, level?, line}` shape.

Real drift surfaced + handled
Agent Q's normalization spec listed `internal-requests` as RFC3339 strings. On the live workspace the file's timestamps are Python-style Unix float seconds (e.g. `"timestamp": 1769281958.2577438`). Without detection this would have produced zero-timestamps for that entire source. `parseTimestamp()` now branches on the first byte of the raw JSON value (`"` → RFC3339, else → float64 unix seconds). No escape to callers; drift commented explicitly in `sourceSpecs`.

`/v1/proprioceptive` preserved byte-for-byte
`dashboard.html:1265` and `canvas.html:1706` both consume the exact shape. `TestProprioceptiveEndpointUnchanged` asserts the top-level keys are exactly `{entries, light_cone}` with no extras. Any future reshape will break CI.

Design judgments beyond spec
- … `limit+1` so single-source queries can report truncation correctly (exact `limit` hides the overflow).
- … `resolveSources` in both HTTP and MCP parsers.
- `internal.log` plain-text excluded from v1 per Agent Q §8 Q3 — JSONL-only keeps the normalized shape clean.
- `bufio.Scanner` reuses its internal buffer between `Scan()` calls; a retained `TraceResult.Line` without a copy would corrupt earlier entries silently.

Test plan
- `QueryTraces` — empty workspace, single-source, multi-source ordering, filters (session_id, since/until, level, source, substring), limit+truncation, malformed-line skip, order=asc, bad source reject, normalization correctness, substring-too-long reject
- `TestHandleTracesHTTP` (shape + 400 bad source), `TestToolSearchTracesMCP` (filter roundtrip)
- `TestProprioceptiveEndpointUnchanged`
- `TestParseTraceDurationOrTime`
- `go test ./internal/engine/... -short -count=1 -race` → `ok 1.674s`
- `go build ./...` + `go vet ./...` silent

Not in scope
- … `cog_tail_events` / `GET /v1/events/stream`), not this surface.
- `internal.log` (non-JSONL).

Design reference
Agent Q's design CogDoc: `~/cog-workspace/.cog/mem/semantic/surveys/2026-04-21-consolidation/agent-Q-logs-search-design.cog.md`

Pre-existing flakes flagged (not this PR)
On `-count>=2`, the ledger package has package-level state leaks between iterations in `TestCrossSessionChain`, `TestAppendEventConcurrent`, and `TestAppendEventChainIntegrity` (`appendMu` + `lastEventCache`). Also, `TestSyncWatcherMarksAlreadyPresentBlob` races the 10 ms poll vs fsnotify. Confirmed present on clean `upstream/main`; unrelated to this commit. Will file as a follow-up issue.