feat: add extra columns and on conflict update #4
Merged
Description
Adds two related features to the staging workflow:
- Extra columns on the staging table. Declare additional columns that exist only on the staging table, not on the source model, via an `extra_columns:` option. Useful for tracking import metadata, priorities, batch identifiers, or processing flags that shouldn't be persisted to the final destination. Each column is specified as either a simple type symbol or a hash with `:type`, `:default`, and `:null` options. Supported types cover the common ActiveRecord set (`:string`, `:text`, `:integer`, `:bigint`, `:float`, `:decimal`, `:boolean`, `:datetime`, `:date`, `:time`, `:binary`, `:json`, `:jsonb`, `:uuid`), with per-adapter SQL type mapping for PostgreSQL, MySQL, and SQLite. Extra columns are automatically excluded during transfer to the source table: the transfer strategies (`Insert` and `Upsert`) now intersect staging and source column names, so any staging-only column is silently skipped, even if added outside this feature.
- Declarative conflict resolution for staging inserts. A new `insert_on_conflict:` option lets you aggregate data into the staging table across multiple inserts with explicit per-column strategies instead of hand-written SQL. Strategies include `:greatest`, `:least`, `:new`, `:existing`, `:sum`, `:coalesce`, and raw SQL passthrough. The logic lives in a new `ConflictResolver` class that generates adapter-specific clauses: `ON CONFLICT … DO UPDATE SET …` for PostgreSQL and SQLite, `ON DUPLICATE KEY UPDATE …` for MySQL. Validation happens at construction time so misconfiguration surfaces early.

Together these let callers use the staging table as a scratchpad for multi-source aggregation (dedupe, merge, and enrich records before transfer) without the result columns leaking into the destination schema.
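To make the "type symbol or hash" declaration format concrete, here is a minimal sketch of how such declarations could be normalized into one shape. It is not the code from this PR: `normalize_extra_columns` is a hypothetical helper, and the `null: true` fallback is an assumption borrowed from ActiveRecord's usual column default.

```ruby
# Hypothetical sketch; not the PR's implementation.
# The type list mirrors the one given in the description.
SUPPORTED_TYPES = %i[
  string text integer bigint float decimal boolean
  datetime date time binary json jsonb uuid
].freeze

# Accepts { name => :type } or { name => { type:, default:, null: } }
# and returns a hash where every column has the full option set.
def normalize_extra_columns(extra_columns)
  extra_columns.to_h do |name, spec|
    options = spec.is_a?(Hash) ? spec : { type: spec }
    type = options.fetch(:type) do
      raise ArgumentError, "extra column #{name.inspect} is missing :type"
    end
    unless SUPPORTED_TYPES.include?(type)
      raise ArgumentError, "unsupported type #{type.inspect} for #{name.inspect}"
    end

    # Assumption: columns are nullable unless null: false is given.
    [name.to_sym, { type: type, default: options[:default], null: options.fetch(:null, true) }]
  end
end

columns = normalize_extra_columns(
  priority: :integer,
  batch_id: { type: :string, default: "none", null: false }
)
```

Normalizing up front means later steps (SQL type mapping, default quoting) only ever see the hash form, and an unsupported type fails before any DDL is generated.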
Example
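As an illustration of the conflict strategies described above, the following is a minimal sketch of how per-column strategies could be turned into an adapter-specific clause. It is not the PR's `ConflictResolver`: `conflict_clause` and `expression` are hypothetical names, identifiers are left unquoted for brevity, the `:coalesce` argument order (incoming value first) is an assumption, and only PostgreSQL and MySQL are covered.

```ruby
# Hypothetical sketch; not the PR's ConflictResolver.
# Builds the right-hand side of one "col = ..." assignment.
def expression(adapter, table, column, strategy)
  # PostgreSQL exposes the incoming row as "excluded"; MySQL uses VALUES(col).
  incoming = adapter == :mysql ? "VALUES(#{column})" : "excluded.#{column}"
  current  = adapter == :mysql ? column.to_s : "#{table}.#{column}"

  case strategy
  when :greatest then "GREATEST(#{current}, #{incoming})"
  when :least    then "LEAST(#{current}, #{incoming})"
  when :new      then incoming
  when :existing then current
  when :sum      then "#{current} + #{incoming}"
  when :coalesce then "COALESCE(#{incoming}, #{current})" # assumed order
  when String    then strategy # raw SQL passthrough, interpolated verbatim
  else raise ArgumentError, "unknown strategy: #{strategy.inspect}"
  end
end

# Builds the full conflict clause for the given adapter.
def conflict_clause(adapter:, table:, target:, update:)
  raise ArgumentError, "update must not be empty" if update.empty?

  assignments = update.map do |column, strategy|
    "#{column} = #{expression(adapter, table, column, strategy)}"
  end.join(", ")

  case adapter
  when :postgresql
    "ON CONFLICT (#{target.join(', ')}) DO UPDATE SET #{assignments}"
  when :mysql
    # MySQL has no conflict target; any violated unique key triggers the update.
    "ON DUPLICATE KEY UPDATE #{assignments}"
  else
    raise ArgumentError, "unsupported adapter: #{adapter.inspect}"
  end
end

puts conflict_clause(
  adapter: :postgresql, table: "staging", target: [:sku],
  update: { qty: :sum, seen_at: :greatest }
)
# prints: ON CONFLICT (sku) DO UPDATE SET qty = staging.qty + excluded.qty, seen_at = GREATEST(staging.seen_at, excluded.seen_at)
```

With `adapter: :mysql` the same call yields `ON DUPLICATE KEY UPDATE qty = qty + VALUES(qty), seen_at = GREATEST(seen_at, VALUES(seen_at))`, and `target` is ignored.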
Review Notes
- Adapters carry their own `sql_type_for` and `quote_default` overrides because types and boolean literals differ. Changes to the shared `Base` defaults will cascade unless explicitly overridden per adapter.
- On MySQL, `ON DUPLICATE KEY UPDATE` fires on any violated unique constraint on the table, not only the ones listed in `:target`. If a staging table has multiple unique indexes, the update may trigger on conflicts the caller didn't intend. The README calls this out explicitly. PostgreSQL and SQLite use `:target` literally via `ON CONFLICT (col, …)`.
- The `staging.column_names & source.column_names` intersection in both transfer strategies protects against any staging-only column, not just `extra_columns`. This means extra columns don't need special bookkeeping at transfer time, and the feature composes cleanly with future additions.
- The `update` hash of `insert_on_conflict:` accepts a raw SQL string as a strategy for cases the built-in strategies don't cover. This is interpolated verbatim, so callers must not pass user-controlled input. Intended for developer-authored expressions like `"COALESCE(excluded.col, staging.col) * 2"`.
- `ConflictResolver` validates options in its constructor, so a malformed `insert_on_conflict:` raises `ConfigurationError` at `StagingTable.stage` call time rather than when the first INSERT runs. Misconfigurations won't silently ship bad SQL.
- `ConflictResolver` is covered by a dedicated unit spec that asserts generated SQL shape per adapter via shared examples. `extra_columns_spec.rb` runs integration tests through the full `StagingTable.stage` path on real databases. MySQL examples are pending in CI environments where MySQL isn't available and run only when the adapter is up.
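The column-intersection point in the notes above can be sketched as follows. This is an illustrative stand-in, not the PR's `Insert` strategy: `transferable_columns` and `transfer_sql` are hypothetical names, the table and column names are made up, and identifiers are left unquoted for brevity.

```ruby
# Hypothetical sketch; not the PR's transfer strategy.
# Array#& keeps only names present in both lists, in staging order,
# so any staging-only column drops out without extra bookkeeping.
def transferable_columns(staging_columns, source_columns)
  staging_columns & source_columns
end

# Builds a plain INSERT ... SELECT over the shared columns only.
def transfer_sql(staging_table:, source_table:, staging_columns:, source_columns:)
  columns = transferable_columns(staging_columns, source_columns)
  raise ArgumentError, "no shared columns to transfer" if columns.empty?

  list = columns.join(", ")
  "INSERT INTO #{source_table} (#{list}) SELECT #{list} FROM #{staging_table}"
end

puts transfer_sql(
  staging_table: "products_staging",
  source_table: "products",
  staging_columns: %w[id sku qty batch_id priority], # batch_id, priority are staging-only
  source_columns: %w[id sku qty created_at]
)
# prints: INSERT INTO products (id, sku, qty) SELECT id, sku, qty FROM products_staging
```

Because the filter is a set intersection rather than a lookup against the declared extra columns, a column added to the staging table by any other means is skipped the same way.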