Skip to content

eval: add TraceMode to omit the value Trace on export#659

Open
borisschlosser wants to merge 7 commits into
mainfrom
boris/trace-mode-and-single-pass-unmarshal
Open

eval: add TraceMode to omit the value Trace on export#659
borisschlosser wants to merge 7 commits into
mainfrom
boris/trace-mode-and-single-pass-unmarshal

Conversation

@borisschlosser

@borisschlosser borisschlosser commented May 28, 2026

Copy link
Copy Markdown
Contributor

Summary

Object merges set Trace.Base on every produced value, and serialization recurses through the whole chain. A leaf at the end of a D-deep import-merge chain therefore carried ~D partial copies of its ancestors on the wire, so opened payloads grew super-linearly in import depth for the same logical content. Combined with a Value.UnmarshalJSON whose cost was O(size × depth), opened-payload decode came to dominate CPU on import-heavy environments (a large fraction of esc.(*Value).UnmarshalJSON on a production profile).

This PR lets callers drop the Trace on paths that never read it, so those payloads stay bounded by their logical content.

What changed

eval.TraceMode + EvalOptions

Two modes, with the historical behavior as the default:

  • TraceModeFull — the zero value. Builds the entire Trace (Def + the full Base chain), so EvalOptions{} is byte-for-byte the prior behavior and no consumer of Trace regresses without opting in.
  • TraceModeNone — exports each value with a zero Trace (no Def, no Base chain). Bounds opened-payload size by logical content instead of import-merge depth.

The Trace is built directly during export(*value).export takes the TraceMode (sourced from the evalContext, which is set once per evaluation and shared with imports, so export memoization stays consistent). Under TraceModeNone the dropped Trace is never allocated; there is no build-then-strip post-processing pass.

The three eval entry points (EvalEnvironment, CheckEnvironment, RotateEnvironment) gain a final EvalOptions argument.

`Value.UnmarshalJSON` json.Unmarshal'd the input twice per nesting level —
once into a RawMessage and once over the captured "value" subtree — so cost
scaled as O(size × depth). The hot path now reuses one json.Decoder for the
entire subtree, including Trace.Base, decoding each byte exactly once per
level.

Object merges set Trace.Base on every produced value, and serialization
recurses through it. As import-merge depth grew, opened payloads serialized
many partial copies of the same logical content. New eval.TraceMode +
EvalOptions arguments control this:

  - TraceModeCollapsed (default): keep the immediate parent only.
  - TraceModeFull: preserve the entire chain.
  - TraceModeNone: drop Trace.Base entirely.

esc.Trace is unchanged (Base stays omitempty). The three eval entry points
gain a final EvalOptions parameter; the zero value selects the safe default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@borisschlosser borisschlosser force-pushed the boris/trace-mode-and-single-pass-unmarshal branch from 3b565af to 7c0428b Compare May 28, 2026 13:02
@borisschlosser borisschlosser marked this pull request as ready for review May 28, 2026 16:05
@borisschlosser borisschlosser requested review from nyobe and pgavlin May 28, 2026 16:05
@tehsis tehsis requested review from a team, Copilot and pulumi-eon June 8, 2026 13:43
@eon-pulumi

This comment was marked as outdated.

This comment was marked as outdated.

Comment thread value.go Outdated
Comment thread value.go Outdated
Comment thread eval/eval_options_test.go Outdated
Comment thread value_test.go Outdated
eon-pulumi[bot]

This comment was marked as outdated.

borisschlosser and others added 2 commits June 11, 2026 11:47
The single-pass decoder diverged from the previous json.Unmarshal path in
two edge cases. Restore exact behavioral parity:

- (*Trace).decodeFrom now accepts a JSON null for "trace", leaving the
  Trace at its zero value instead of erroring. The old struct-based path
  silently zeroed any field on null; forward-compatible payloads that null
  out unknown fields depend on this.
- decodeValueField initializes the array accumulator as []Value{} so an
  empty array stays a non-nil slice. json.Marshal renders a nil []Value as
  null and a non-nil empty one as [], so without this a decode->encode
  round trip silently rewrote "value":[] to "value":null.

Add table cases plus a round-trip test pinning both behaviors, and fix the
chainValue test helper to use fmt.Sprintf so labels stay correct past
depth 9 (string(rune('0'+i)) produced non-digit characters).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
applyTraceMode early-returns on TraceModeFull and otherwise switched on
None/Collapsed with no default, so an out-of-range TraceMode value stripped
nothing and recursed — silently producing the most expensive behavior (full
chains), the opposite of the documented safe default. TraceMode is public
API, so make Collapsed the switch default: any unrecognized mode now degrades
to the bounded default.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@borisschlosser

This comment was marked as resolved.

@eon-pulumi

This comment was marked as outdated.

@borisschlosser

This comment was marked as resolved.

@eon-pulumi

This comment was marked as outdated.

eon-pulumi[bot]

This comment was marked as outdated.

@pgavlin

pgavlin commented Jun 16, 2026

Copy link
Copy Markdown
Member

Having a TraceMode seems great--in the majority of cases we probably don't need the full trace. Some high-level feedback:

  • I don't think we need the Collapsed variant. IMO the value in the trace is being able to walk the complete def chain--walking one level doesn't really help much.
  • We should not change the defaults in the public API. The default should remain Full.
  • We should not implement by post-processing traces: we should just change (*value).export to not build the trace in the first place.
  • We should not roll our own JSON decoder for the value struct. Instead, I think we should add another struct for decoding that hand-rolls only the bits necessary to handle the polymorphic value property. We should also consider using segmentio's JSON decoder, which can do zero-copy decoding from RawMessages. If we want to push performance as far as we can, we might want to investigate easyjson, which does ahead-of-time code generation for JSON serde.

@pgavlin

pgavlin commented Jun 16, 2026

Copy link
Copy Markdown
Member

Also--wondering if we have profile data backing up these changes? We should take them regardless but I'm curious to know what else is hot

…ecoder

Reworks the trace-mode change per PR review and the May 2026 CPU root-cause
findings:

- Drop TraceModeCollapsed. The only consumer of Trace.Base is `esc env get`,
  which walks the *entire* chain (Check path); a single-level collapse would
  feed it a silently-truncated provenance stack while the service (Open path)
  reads Trace.Base not at all. So the useful axis is Full vs None, with no
  middle ground.

- Make TraceModeFull the zero value, restoring it as the default. Callers
  passing EvalOptions{} keep the historical full-chain behavior, so no
  Trace.Base consumer regresses without opting in. The read-only Open path
  opts into TraceModeNone in the service.

- Build the chain lazily in (*value).export via an includeBase flag threaded
  off the evalContext, instead of building the full chain and stripping it in
  a post-evaluation pass. Under TraceModeNone the dropped chain is never
  allocated. Deletes applyTraceMode / applyTraceModeToEnvironment.

- Replace the hand-rolled streaming Value decoder with a struct-based one that
  hand-rolls only the polymorphic "value" field; secret/unknown/trace (and
  Trace.Base recursion) go back through the standard decoder. Once TraceModeNone
  removes the deep merge chain from Open-path payloads, the import-depth factor
  that drove the O(size × depth) decode cost is gone, so the simpler decoder is
  sufficient.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@borisschlosser

This comment was marked as resolved.

@eon-pulumi

This comment was marked as outdated.

@borisschlosser

This comment was marked as outdated.

@borisschlosser borisschlosser changed the title eval: bound serialized Trace.Base and decode Value in a single pass eval: add TraceMode to omit the Trace.Base merge-history chain on export Jun 17, 2026
Comment thread eval/value.go Outdated
Comment thread value.go Outdated
borisschlosser and others added 2 commits June 19, 2026 13:49
The decode-CPU cost identified in profiling was driven by deep Trace.Base
merge-history chains in opened payloads (the O(size × depth) factor), not by
the decoder itself. TraceModeNone removes those chains at the source, so the
original stdlib-based Value.UnmarshalJSON is sufficient and there is no need to
hand-roll any JSON parsing.

Reverts value.go to its original decoder (now byte-identical to main). The
added UnmarshalJSON characterization tests and depth benchmark are kept as
regression guards for the existing decoder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pass the TraceMode enum to (*value).export instead of a derived bool, and gate
the entire esc.Trace on it: TraceModeNone now omits Def as well as the Base
chain, so a None-exported value carries a zero Trace and the dropped data is
never built. Removes the includeTraceBase() helper; export call sites pass the
evalContext's traceMode directly.

Also trims the new/changed comments across the change to focus on why.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@borisschlosser borisschlosser changed the title eval: add TraceMode to omit the Trace.Base merge-history chain on export eval: add TraceMode to omit the value Trace on export Jun 19, 2026
These exercised Value.UnmarshalJSON, which this PR no longer changes (the
decoder revert left value.go identical to main). Removing them keeps the PR
scoped to the TraceMode change; the decoder coverage can land on its own.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@borisschlosser

Copy link
Copy Markdown
Contributor Author

@pgavlin this PR now focuses on not calculating traces when TraceModeNone is set. The majority of callers are env open calls which never read or use the traces at all. So these callers can now opt in to not calculating traces at all which improves overall system performance significantly. Other callers continue to calculate and report traces (TraceModeFull).

If we want to further improve performance during trace calculation we can switch the json library for improved processing. But we should wait until the TraceMode changes land and profile the code again.

Please find an AI research below:

JSON library options for esc.(*Value).UnmarshalJSON — follow-up comparison

The intrinsic cost the stdlib decoder leaves on the table is the double-scan of the value field (RawMessage capture + typed re-unmarshal) plus reflection overhead. The awkward bit is the polymorphic value any field — anything we adopt has to handle null/bool/number/string/[]Value/map[string]Value dispatch.

segmentio/encoding mailru/easyjson encoding/json/v2 (go-json-experiment)
Mechanism Drop-in reflective decoder, asm-accelerated scanner AOT codegen per type Streaming decoder with a shared-decoder hook
Fixes the double-scan? No — faster scanner, but our UnmarshalJSON still captures+re-parses Yes — single-pass via jlexer YesUnmarshalJSONFrom(*jsontext.Decoder) decodes the whole tree in one pass
Polymorphic value field Same custom UnmarshalJSON as today Hand-write a jlexer decoder for that field Dispatch via dec.PeekKind(), recurse with json.UnmarshalDecode(dec, &child)
Hand-rolling parsing? No Yes (jlexer) — the thing pgavlin objected to, in a lib's API No — the streaming primitive is library-provided
Dependency status Already an indirect dep; maintenance-mode Stable but low-maintenance; adds codegen step + generated files Pre-stable API; importable as a normal module today; lands in stdlib on a future Go
Behavioral parity risk Medium — verify json.Number/escaping/unknown-fields parity Low (you control generated code) Medium — v2 defaults are stricter (case-sensitive, dup-key rejection); need opts for v1 behavior
Effort Low (swap import at the decode site) Medium-high (codegen infra + jlexer hand-code) Medium (implement UnmarshalJSONFrom)
Expected win Constant-factor (~1.5–2× scan), double-scan remains Largest raw speed Large; eliminates double-scan + per-level decoder allocation

Ranking for this codebase

  1. json/v2 — best strategic fit. Its UnmarshalJSONFrom(*jsontext.Decoder) is exactly the primitive our problem needs: single-pass, shared-decoder behavior as a clean library API — so it satisfies "no hand-rolling" and removes the double-scan. It's also the future stdlib direction. Sketch:

    func (v *Value) UnmarshalJSONFrom(dec *jsontext.Decoder) error {
        // dec is shared across the whole subtree: one pass, no per-level decoder alloc.
        // dispatch the "value" field via dec.PeekKind(); recurse children with
        // json.UnmarshalDecode(dec, &child).
    }

    Downside: pre-stable API churn, and a v1↔v2 behavioral-parity review.

  2. segmentio — best low-risk quick win. Already an indirect dep, drop-in API, swap is a few lines. But it doesn't fix the double-scan structurally — it just makes each scan faster. Good if a profile shows a modest gap and you want minimal change.

  3. easyjson — fastest raw, worst ergonomics. Top speed and true single-pass, but the polymorphic value field forces a hand-written jlexer decoder (more hand-rolling than what was just rejected) plus a codegen step and generated files in the tree. Hard to justify here unless decode is dominant and the other two fall short.

@borisschlosser borisschlosser requested a review from pgavlin June 19, 2026 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants