refactor(object_store): PayloadView end-to-end; cross-store flushTo; atomic target swap#105

Merged: facontidavide merged 2 commits into main from feat/object-store-payloadview on May 29, 2026
@pabloinigoblasco
Collaborator

Summary

A consolidated refactor of the ObjectStore / DataEngine surface that:

  • Makes sdk::PayloadView (Span<const uint8_t> + BufferAnchor) the universal vocabulary for ownership of bytes across the SDK.
  • Removes the hidden static_pointer_cast<const vector<uint8_t>> trap in resolveEntry — any producer can now anchor on any shared_ptr<T> (arrow buffers, mmap pools, foreign ctx) and the type erasure travels intact to consumers.
  • Adds a zero-copy cross-instance transfer primitive (flushTo) for both ObjectStore and DataEngine. Pointer-move of internal deques; entries cross the boundary without copy or closure invocation.
  • Exposes an atomic target-swap primitive on plugin_data_host write hosts (setTarget) so a host can redirect writes between two stores at runtime safely under concurrent ingest.

Contents

ObjectEntry / ResolvedObjectEntry

  • ObjectEntry::payload is std::any, holding either:
    • std::shared_ptr<const std::vector<uint8_t>> — eager owned, counted against the retention budget.
    • std::function<sdk::PayloadView()> — lazy resolver returning Span + BufferAnchor; not counted (bytes are owned upstream).
  • resolveEntry dispatches via std::any_cast on each alternative explicitly. Each branch handles its own concrete type — no static_pointer_cast on the lazy anchor. The BufferAnchor (= shared_ptr<const void>) stays opaque end-to-end.
  • ResolvedObjectEntry::payload is sdk::PayloadView. The previous data field (shared_ptr<const vector<uint8_t>>) is removed — it locked consumers to a concrete anchor type and required the cast that would fail silently for non-vector anchors.
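
A minimal sketch of the dispatch described above, assuming simplified stand-ins for the SDK types. Span, PayloadView, ForeignBlob, resolvePayload and makeBlobResolver are hypothetical names written for this illustration and are not the actual pj_datastore definitions. It shows how testing each std::any alternative separately lets a lazy resolver anchor on an arbitrary shared_ptr<T> without any cast on the anchor:

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-ins for sdk::Span, sdk::BufferAnchor and sdk::PayloadView.
struct Span {
  const std::uint8_t* ptr = nullptr;
  std::size_t len = 0;
  const std::uint8_t* data() const { return ptr; }
  std::size_t size() const { return len; }
  bool empty() const { return len == 0; }
};
using BufferAnchor = std::shared_ptr<const void>;
struct PayloadView {
  Span bytes;
  BufferAnchor anchor;
};

using EagerBytes = std::shared_ptr<const std::vector<std::uint8_t>>;
using LazyResolver = std::function<PayloadView()>;

// A non-vector owner, standing in for an arrow buffer or an mmap pool.
struct ForeignBlob {
  std::uint8_t raw[4] = {9, 8, 7, 6};
};

// Test each std::any alternative explicitly; the anchor is never cast back.
PayloadView resolvePayload(const std::any& payload) {
  if (const auto* eager = std::any_cast<EagerBytes>(&payload)) {
    if (*eager == nullptr) return {};
    const std::vector<std::uint8_t>& vec = **eager;
    // The vector's shared_ptr converts implicitly to shared_ptr<const void>.
    return PayloadView{Span{vec.data(), vec.size()}, *eager};
  }
  if (const auto* lazy = std::any_cast<LazyResolver>(&payload)) {
    if (!*lazy) return {};
    return (*lazy)();  // the resolver supplies its own span and anchor
  }
  return {};  // unknown alternative: empty span, null anchor
}

// Build a lazy resolver anchored on a ForeignBlob rather than a vector.
LazyResolver makeBlobResolver(std::shared_ptr<const ForeignBlob> blob) {
  assert(blob != nullptr);
  return [blob]() {
    return PayloadView{Span{blob->raw, sizeof(blob->raw)}, blob};
  };
}
```

In this sketch the consumer only ever sees a span plus an opaque anchor, so it cannot tell (and does not need to know) whether the bytes came from a vector or from a foreign owner.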

Cross-store flush

  • ObjectStore::flushTo(ObjectStore& dst) and DataEngine::flushTo(DataEngine& dst): two-phase, atomic, zero-copy bulk transfer of entries / chunks between two instances. Topics matched by descriptor (dataset_id + name); monotonicity enforced strictly per series/topic; failure leaves both sides untouched. Each entry/chunk is moved by value — the std::any (or deque<TopicChunk>) transfers its buffer intact, lazy closures preserved.
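
The two-phase shape can be sketched as follows. This is an illustration under stated assumptions, not the real ObjectStore: MiniStore and Entry are hypothetical, topics are keyed by name only (the PR matches on dataset_id + name), "strict monotonicity" is assumed to mean strictly increasing timestamps, and the sketch is single-threaded and ignores allocation failure during the commit phase.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <utility>

// Hypothetical entry: a timestamp plus an owned payload.
struct Entry {
  std::int64_t timestamp;
  std::string payload;
};

class MiniStore {
 public:
  // Append an entry; rejects timestamps that do not strictly increase.
  bool push(const std::string& topic, Entry e) {
    std::deque<Entry>& q = topics_[topic];
    if (!q.empty() && e.timestamp <= q.back().timestamp) return false;
    q.push_back(std::move(e));
    return true;
  }

  // Phase 1 validates every topic without mutating anything.
  // Phase 2 moves the entries. On failure both stores are untouched.
  bool flushTo(MiniStore& dst) {
    if (&dst == this) return false;
    for (const auto& [topic, src_q] : topics_) {
      if (src_q.empty()) continue;
      auto it = dst.topics_.find(topic);
      if (it != dst.topics_.end() && !it->second.empty() &&
          src_q.front().timestamp <= it->second.back().timestamp) {
        return false;  // would break monotonicity in dst
      }
    }
    for (auto& [topic, src_q] : topics_) {
      if (src_q.empty()) continue;
      std::deque<Entry>& dst_q = dst.topics_[topic];
      // Guaranteed by phase 1.
      assert(dst_q.empty() || dst_q.back().timestamp < src_q.front().timestamp);
      for (Entry& e : src_q) dst_q.push_back(std::move(e));  // move, not copy
      src_q.clear();
    }
    return true;
  }

  std::size_t size(const std::string& topic) const {
    auto it = topics_.find(topic);
    return it == topics_.end() ? 0 : it->second.size();
  }

 private:
  std::map<std::string, std::deque<Entry>> topics_;
};
```

Separating validation from mutation is what makes the "failure leaves both sides untouched" guarantee cheap: the commit phase never has to roll anything back.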

Atomic target swap

  • DatastoreSourceWriteHost::setTarget, DatastoreParserWriteHost::setTarget, and the object-write equivalents atomically swap which engine/store receives subsequent writes. Safe under concurrent ingest: in-flight writes complete on the previous target, future writes go to the new one.
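
One way to obtain the semantics described above is to have each write take a shared_ptr snapshot of the current target under a short lock and then write through the snapshot. The sketch below assumes that design; the PR does not state how setTarget is implemented, and MiniEngine and WriteHostSketch are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical engine that records integer samples.
struct MiniEngine {
  void append(int sample) {
    std::lock_guard<std::mutex> lock(mu);
    samples.push_back(sample);
  }
  std::mutex mu;
  std::vector<int> samples;
};

class WriteHostSketch {
 public:
  explicit WriteHostSketch(std::shared_ptr<MiniEngine> target)
      : target_(std::move(target)) {
    assert(target_ != nullptr);
  }

  // Swap the target under the lock. The previous target is released after
  // the lock is dropped, and only once no in-flight write still holds it.
  void setTarget(std::shared_ptr<MiniEngine> next) {
    assert(next != nullptr);
    std::lock_guard<std::mutex> lock(mu_);
    target_.swap(next);
  }

  // Snapshot the target, then write outside the host lock. A write that
  // took its snapshot before a swap completes on the previous target.
  void write(int sample) {
    std::shared_ptr<MiniEngine> snapshot;
    {
      std::lock_guard<std::mutex> lock(mu_);
      snapshot = target_;
    }
    snapshot->append(sample);
  }

 private:
  std::mutex mu_;
  std::shared_ptr<MiniEngine> target_;
};
```

Holding the host lock only for the pointer copy keeps setTarget from blocking behind a slow write, at the cost of one reference-count increment per write.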

Helper rename

  • sdk::makeOwnedPayloadView(vector) → sdk::makePayloadView(vector).
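
As an illustration of what such a helper can look like, the sketch below moves a vector into shared storage and uses that one shared_ptr as both the byte backing and the anchor. makePayloadViewSketch, sumBytes, Span and PayloadView are hypothetical stand-ins; the real sdk::makePayloadView signature may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for sdk::Span, sdk::BufferAnchor and sdk::PayloadView.
struct Span {
  const std::uint8_t* ptr = nullptr;
  std::size_t len = 0;
  const std::uint8_t* data() const { return ptr; }
  std::size_t size() const { return len; }
  bool empty() const { return len == 0; }
};
using BufferAnchor = std::shared_ptr<const void>;
struct PayloadView {
  Span bytes;
  BufferAnchor anchor;
};

// Move the vector into shared storage; the span points into that storage
// and the same shared_ptr keeps it alive as the anchor.
PayloadView makePayloadViewSketch(std::vector<std::uint8_t> bytes) {
  auto owner = std::make_shared<std::vector<std::uint8_t>>(std::move(bytes));
  Span span{owner->data(), owner->size()};
  return PayloadView{span, std::move(owner)};
}

// Example consumer: reads the bytes without knowing what the anchor is.
std::uint32_t sumBytes(const PayloadView& view) {
  // Invariant assumed by this sketch: non-empty bytes are always anchored.
  assert(view.bytes.empty() || view.anchor != nullptr);
  std::uint32_t total = 0;
  for (std::size_t i = 0; i < view.bytes.size(); ++i) total += view.bytes.data()[i];
  return total;
}
```

Because the vector is moved rather than copied, the span refers to the original heap buffer, so wrapping costs one control-block allocation and no byte copy.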

ObjectBytesBox removed

  • plugin_data_host.cpp previously wrapped the resolved bytes in an ObjectBytesBox for the C-ABI toolbox handle. With PayloadView on ResolvedObjectEntry, the wrapper became redundant; the handle now points at a heap-allocated PayloadView directly.
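
A rough sketch of the idea, assuming an opaque handle type and accessor functions invented for this example. BytesHandle, handleCreate, handleData, handleSize and handleRelease are hypothetical and are not the plugin_data_host C ABI; the point is only that the handle can wrap a heap-allocated PayloadView with no intermediate box type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for sdk::Span, sdk::BufferAnchor and sdk::PayloadView.
struct Span {
  const std::uint8_t* ptr = nullptr;
  std::size_t len = 0;
  const std::uint8_t* data() const { return ptr; }
  std::size_t size() const { return len; }
};
using BufferAnchor = std::shared_ptr<const void>;
struct PayloadView {
  Span bytes;
  BufferAnchor anchor;
};

// Opaque to C callers; never defined. It only names the pointer type.
struct BytesHandle;

// Host side: the handle is a heap-allocated PayloadView, nothing more.
BytesHandle* handleCreate(PayloadView view) {
  return reinterpret_cast<BytesHandle*>(new PayloadView(std::move(view)));
}

extern "C" const std::uint8_t* handleData(const BytesHandle* handle) {
  assert(handle != nullptr);
  return reinterpret_cast<const PayloadView*>(handle)->bytes.data();
}

extern "C" std::size_t handleSize(const BytesHandle* handle) {
  assert(handle != nullptr);
  return reinterpret_cast<const PayloadView*>(handle)->bytes.size();
}

// Deleting the PayloadView drops its anchor, releasing the shared bytes
// once no other owner remains.
extern "C" void handleRelease(BytesHandle* handle) {
  delete reinterpret_cast<PayloadView*>(handle);
}
```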

Version bump

0.4.0 → 0.5.0. Source-incompatible:

  • ResolvedObjectEntry::data renamed to payload and changes shape (PayloadView instead of shared_ptr<vector>).
  • sdk::makeOwnedPayloadView renamed to sdk::makePayloadView.

Consumer migration is mechanical:

  • entry->data->{data,size,empty}() → entry->payload.bytes.{data,size,empty}()
  • entry->data == nullptr → entry->payload.anchor == nullptr
  • Retaining the bytes' shared ownership: entry->data (shared_ptr) → entry->payload.anchor (BufferAnchor = shared_ptr).
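
The rewrite rules above can be seen side by side in a small sketch. OldResolvedEntry and NewResolvedEntry are hypothetical reductions of ResolvedObjectEntry before and after this PR, keeping only the field that changes, and the helper functions are illustrative consumer code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins for sdk::Span, sdk::BufferAnchor and sdk::PayloadView.
struct Span {
  const std::uint8_t* ptr = nullptr;
  std::size_t len = 0;
  const std::uint8_t* data() const { return ptr; }
  std::size_t size() const { return len; }
  bool empty() const { return len == 0; }
};
using BufferAnchor = std::shared_ptr<const void>;
struct PayloadView {
  Span bytes;
  BufferAnchor anchor;
};

// 0.4.0 shape: bytes reachable only through a concrete vector.
struct OldResolvedEntry {
  std::shared_ptr<const std::vector<std::uint8_t>> data;
};

// 0.5.0 shape: span plus opaque anchor.
struct NewResolvedEntry {
  PayloadView payload;
};

// Before: null check and size both go through entry->data.
std::size_t payloadSizeOld(const OldResolvedEntry* entry) {
  assert(entry != nullptr);
  if (entry->data == nullptr) return 0;
  return entry->data->size();
}

// After: null check on the anchor, size on the span.
std::size_t payloadSizeNew(const NewResolvedEntry* entry) {
  assert(entry != nullptr);
  if (entry->payload.anchor == nullptr) return 0;
  return entry->payload.bytes.size();
}

// After: keep the bytes alive past the entry by copying the anchor.
BufferAnchor retainAnchor(const NewResolvedEntry* entry) {
  assert(entry != nullptr);
  return entry->payload.anchor;
}
```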

Tests

  • pj_datastore/tests/object_store_test.cpp: 45 tests including new flushTo coverage (basic transfer, monotonicity rejection, lazy closure preservation across flush, shared_ptr identity, retention budget after flush).
  • pj_datastore/tests/engine_integration_test.cpp: 21 tests including new DataEngine flushTo coverage.
  • pj_datastore/tests/plugin_data_host_object_test.cpp: existing tests updated for the PayloadView surface.

All green on Linux/gcc under ./build.sh --debug && ./test.sh.

…atomic target swap

A consolidated refactor of the ObjectStore / DataEngine surface that
makes PayloadView the universal vocabulary for ownership of bytes
across the SDK, adds zero-copy cross-instance transfer of entries
between stores, and exposes an atomic target-swap primitive on the
plugin data host used to redirect writes between two stores at runtime.

ObjectStore + ObjectEntry

  - ObjectEntry::payload is std::any holding either
    std::shared_ptr<const std::vector<uint8_t>> (eager owned bytes,
    counted against the retention budget) or
    std::function<sdk::PayloadView()> (lazy resolver returning Span +
    BufferAnchor; not counted, bytes owned upstream).
    resolveEntry dispatches via std::any_cast; each branch handles its
    own concrete type. No static_pointer_cast on the lazy anchor — its
    type erasure (BufferAnchor = shared_ptr<const void>) is preserved
    end-to-end so producers can anchor on any shared_ptr<T>.

  - ResolvedObjectEntry consolidates on PayloadView (Span + anchor).
    Removes the previous shared_ptr<const vector<uint8_t>> field,
    which locked consumers to a concrete anchor type and required a
    hidden static_pointer_cast in resolveEntry that would fail silently
    for non-vector anchors.

  - ObjectStore::flushTo(ObjectStore& dst): two-phase, atomic, zero-
    copy bulk transfer of entries between two instances. Topics matched
    by descriptor; monotonicity enforced strictly per series; failure
    leaves both sides untouched. Each ObjectEntry is moved by value
    via std::move; the std::any inside transfers its buffer intact.

  - DataEngine::flushTo(DataEngine& dst): symmetric primitive for the
    columnar scalar store. Each TopicStorage's sealed-chunk deque is
    moved through to the destination; no chunk constructor invoked.

  - sdk::makePayloadView(std::vector<uint8_t>) replaces
    sdk::makeOwnedPayloadView. Wraps a vector into a shared_ptr that
    serves as both the bytes backing and the BufferAnchor.

  - ObjectBytesBox removed from plugin_data_host. The C-ABI toolbox
    handle becomes a heap-allocated PayloadView directly.

plugin_data_host atomic target swap

  - setTarget on engine and parser write hosts atomically swaps which
    engine/store receives subsequent writes. Safe under concurrent
    ingest: in-flight writes complete on the previous target.

Version bump

  - 0.4.0 -> 0.5.0. Source-incompatible: ResolvedObjectEntry::data
    renamed to payload (PayloadView instead of shared_ptr<vector>).
    Consumers migrate from entry->data->{data,size,empty}() to
    entry->payload.bytes.{data,size,empty}(); from entry->data to
    entry->payload.anchor for ownership retention.
@facontidavide facontidavide force-pushed the feat/object-store-payloadview branch from 0d0a763 to fafa624 on May 29, 2026 09:14
@facontidavide facontidavide merged commit 1447d4f into main May 29, 2026
4 checks passed
@facontidavide facontidavide deleted the feat/object-store-payloadview branch May 29, 2026 10:19