Design: Repo Data Structures III
Eric Hanson edited this page Nov 16, 2024 · 2 revisions
This page gives a general overview of the central design decision in pg_delta: its data structures.
Delta needs to represent the following as data structures:
- rows
  - ordered in checkout order (reverse delete order)
  - unique
- fields
  - column/value set for each row
- tracked_rows_added - unique set of row_ids
- stage_rows_to_add - unique set of row_ids
- stage_rows_to_remove - unique set of row_ids
- stage_fields_to_change - unique set of column_name: value pairs associated with a row_id
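The structures listed above can be sketched in memory as follows. This is only an illustration in Python, not pg_delta's actual representation: the class names `Commit` and `Repo`, the `RowId` alias, and the choice of dicts and sets are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple

# Assumption: a row_id is an opaque, hashable identifier.
RowId = str


@dataclass
class Commit:
    # A dict keeps insertion order and unique keys, so its keys model
    # "unique rows in checkout order". The values are unused.
    rows: Dict[RowId, None] = field(default_factory=dict)
    # (row_id, column_name) -> value: one column/value set per row.
    fields: Dict[Tuple[RowId, str], Any] = field(default_factory=dict)


@dataclass
class Repo:
    # Each of these is a unique set of row_ids.
    tracked_rows_added: Set[RowId] = field(default_factory=set)
    stage_rows_to_add: Set[RowId] = field(default_factory=set)
    stage_rows_to_remove: Set[RowId] = field(default_factory=set)
    # row_id -> {column_name: value}
    stage_fields_to_change: Dict[RowId, Dict[str, Any]] = field(default_factory=dict)


commit = Commit()
for row_id in ["r1", "r2", "r1"]:  # the repeated "r1" is ignored
    commit.rows[row_id] = None
commit.fields[("r1", "name")] = "alpha"

repo = Repo()
repo.stage_rows_to_add.add("r3")
repo.stage_fields_to_change.setdefault("r1", {})["name"] = "beta"
```

Whatever representation is chosen in the database has to offer the same guarantees this sketch gets from dicts and sets: uniqueness, a stable row order, and fast lookup by row_id.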
A conceptual overview of the logical operations that need to occur (quickly):
- commit(repo) - apply the stage to the previous commit and save the snapshot to the db
  - row_ids = (parent_commit.rows + stage_rows_to_add) - stage_rows_to_remove
  - fields (row_id, column_name, value) = (parent_commit.fields + stage_rows_to_add.fields) - stage_rows_to_remove.fields
- checkout(commit) - upsert commit rows join commit fields into live db
- track_untracked_rows(repo, relation_id)
- stage_tracked_rows(repo)
- stage_removed_rows(repo)
- stage_updated_fields(repo)
- stage_changed_fields(repo)
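The two set expressions given for commit can be sketched as plain Python. The function names `commit_rows` and `commit_fields` are hypothetical, and the sketch assumes that subtracting `stage_rows_to_remove.fields` means dropping every field that belongs to a removed row. It follows the formulas as written, so `stage_fields_to_change` is not applied here.

```python
from typing import Any, Dict, List, Set, Tuple

# (row_id, column_name, value), as in the fields formula.
Field = Tuple[str, str, Any]


def commit_rows(parent_rows: List[str],
                stage_rows_to_add: List[str],
                stage_rows_to_remove: Set[str]) -> List[str]:
    """row_ids = (parent_commit.rows + stage_rows_to_add) - stage_rows_to_remove"""
    # dict.fromkeys de-duplicates while keeping first-seen order.
    merged = dict.fromkeys(parent_rows + stage_rows_to_add)
    return [row_id for row_id in merged if row_id not in stage_rows_to_remove]


def commit_fields(parent_fields: List[Field],
                  added_fields: List[Field],
                  stage_rows_to_remove: Set[str]) -> List[Field]:
    """fields = (parent_commit.fields + stage_rows_to_add.fields)
                - stage_rows_to_remove.fields"""
    merged: Dict[Tuple[str, str], Any] = {}
    for row_id, column_name, value in parent_fields + added_fields:
        merged[(row_id, column_name)] = value  # one value per (row, column)
    return [(row_id, column_name, value)
            for (row_id, column_name), value in merged.items()
            if row_id not in stage_rows_to_remove]


rows = commit_rows(["a", "b"], ["c", "a"], {"b"})
fields = commit_fields([("a", "x", 1), ("b", "x", 2)], [("c", "x", 3)], {"b"})
```

Both functions make a single pass over their inputs. The database-side equivalent would need the same property for commit to stay quick.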
- hyper-normalized - tables/rows for every row and field in both stage and commit (the old bundle way)
- hyper-jsonb - each commit has only a jsonb manifest, which contains everything in the commit; the repository has stage and track jsonb objects
- array
- hstore
- others?