Storage Engine v4 — Stack 2: c1z3 envelope format#869
Open
c1-squire-dev[bot] wants to merge 1 commit into
Stack 2 of the storage-engine-v4 PR series. Wires the v3 file
envelope on top of Stack 1's protos and codec layer. All under
`//go:build batonsdkv2`.
Proto at `proto/c1/c1z/v3/manifest.proto`:
* `C1ZManifestV3` — engine identifier, engine_schema_version,
engine_config (Any, typically PebbleEngineConfig), payload
encoding, transitively-closed FileDescriptorSet, record-type
catalog, sync-run summary, tooling metadata.
* `PayloadEncoding` enum (UNSPECIFIED, RAW, ZSTD, ZSTD_TAR).
* `RecordTypeInfo` for per-record-type schema versioning.
* `SyncRunSummary` for cheap tooling enumeration without engine open.
* `PebbleEngineConfig` — pinned format_major_version + cache size hint.
Format package at `pkg/dotc1z/format/v3/`:
* `manifest.go` — Marshal/Unmarshal with empty-engine guard;
BuildDescriptorClosure walks protoregistry.GlobalFiles for
c1.storage.v3 packages and returns the transitive-closure
FileDescriptorSet; VerifyDescriptorClosure detects missing
dependencies at read time (returns ErrManifestIncompleteDescriptors).
* `envelope.go` — WriteEnvelope emits magic + uint32-BE length
prefix + manifest + zstd-tar payload. ReadEnvelope parses each
layer with explicit truncation + manifest-size cap (16 MiB) +
invalid-magic guards. ExtractZstdTar streams the payload tar into
a destination directory with directory-traversal protection via
filepath.IsLocal.
Tests cover: full directory roundtrip (write a synthetic Pebble-shaped
dir, read it back, verify file contents); bad magic; truncated header;
empty-engine refusal on both Marshal and Unmarshal; descriptor closure
completeness verification (missing dep detected; complete closure
accepted; storage.v3 records.proto present in the auto-built closure).
Deferred:
* Full protodesc.ToFileDescriptorProto conversion in
BuildDescriptorClosure (Stack 3) — Stack 2 ships the import-graph-
plus-package-name projection so the closure invariant is testable.
The richer projection lands when the engine actually consumes the
descriptors via dynamicpb at open.
* `cmd/baton-descriptor-closure-test/` standalone CI fixture lands
in Stack 5 alongside the equivalence harness.
Refs: RFC v4 §3.2 (envelope), Appendix B (full envelope spec),
research/import-11.md TestCheckpointRoundtrip for the source-DB-
stays-writable property the save protocol depends on.
Summary
Stack 2 of RFC 0004. Stacks on Stack 1.
The v3 c1z file format envelope:
C1Z3\x00 | uint32_be(manifest_len) | manifest_proto | zstd(tar(pebble_dir))
New proto at `proto/c1/c1z/v3/manifest.proto`, message `C1ZManifestV3`:
* `engine`: name of the engine that produced this c1z ("pebble", "sqlite").
* `engine_schema_version`: schema major version (currently 1).
* `engine_config`: opaque `google.protobuf.Any` for engine-specific config (e.g. `PebbleEngineConfig` includes the pinned format major version).
* `payload_encoding`: `ZstdTar` or `UncompressedTar`.
* `descriptors`: `FileDescriptorSet` containing the closure of types in this manifest (lets a future reader self-describe without our codebase).
* `record_types`: `[]RecordTypeInfo` with the (table_name, full_type_name) of each record family.
* `sync_runs`: `[]SyncRunSummary` so a reader knows what syncs are inside without opening Pebble.
New format/v3 package at `pkg/dotc1z/format/v3/`:
* `manifest.go`: `MarshalManifest` / `UnmarshalManifest` + `BuildDescriptorClosure(protoregistry.Files) → *descriptorpb.FileDescriptorSet` + `VerifyDescriptorClosure`.
* `envelope.go`: `WriteEnvelope(w, manifest, payloadDir)` + `ReadEnvelope(r) (manifest, payloadReader, error)` + `ExtractZstdTar(payload, dir)` with `filepath.IsLocal` traversal protection. Manifest size capped at 16 MiB.
Tests (`envelope_test.go`):
* `TestEnvelopeRoundtrip`: write synthetic Pebble-shaped dir, read back, extract, verify.
* `TestReadEnvelopeBadMagic`: refuses non-C1Z3 inputs.
* `TestReadEnvelopeTruncated`: refuses truncated headers.
* `TestVerifyDescriptorClosureCatchesMissing`: surfaces incomplete closures.
* `TestBuildDescriptorClosureIncludesStorageV3`: verifies the closure walker picks up storage v3 types.
Test plan
* `make lint` clean
🤖 Generated with Claude Code