Commit c8437c8

Update duckdb incremental by unique key materialization strategy (#2426)
1 parent a0e55de commit c8437c8

1 file changed: +18 -18 lines changed

docs/concepts/models/model_kinds.md

Lines changed: 18 additions & 18 deletions
@@ -208,15 +208,15 @@ The `source` and `target` aliases are required when using the `when_matched` exp
 ### Materialization strategy
 Depending on the target engine, models of the `INCREMENTAL_BY_UNIQUE_KEY` kind are materialized using the following strategies:
 
-| Engine     | Strategy            |
-|------------|---------------------|
-| Spark      | not supported       |
-| Databricks | MERGE ON unique key |
-| Snowflake  | MERGE ON unique key |
-| BigQuery   | MERGE ON unique key |
-| Redshift   | MERGE ON unique key |
-| Postgres   | MERGE ON unique key |
-| DuckDB     | not supported       |
+| Engine     | Strategy                            |
+|------------|-------------------------------------|
+| Spark      | not supported                       |
+| Databricks | MERGE ON unique key                 |
+| Snowflake  | MERGE ON unique key                 |
+| BigQuery   | MERGE ON unique key                 |
+| Redshift   | MERGE ON unique key                 |
+| Postgres   | MERGE ON unique key                 |
+| DuckDB     | DELETE ON matched + INSERT new rows |
 
 ## FULL
 Models of the `FULL` kind cause the dataset associated with a model to be fully refreshed (rewritten) upon each model evaluation.
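
The new DuckDB row in the table above describes a delete-then-insert strategy in place of `MERGE`. The following is a minimal sketch of that general pattern, not the SQL that SQLMesh actually generates. It assumes Python's built-in `sqlite3` module as a stand-in engine (not DuckDB), and the `delete_then_insert` helper and the `menu_items` and `new_rows` tables are hypothetical names chosen for illustration.

```python
import sqlite3


def delete_then_insert(conn, target, source, key):
    """Upsert rows from `source` into `target` by unique key.

    Hypothetical helper: emulates MERGE with DELETE + INSERT.
    Table and column names are interpolated directly, so they
    must come from trusted input.
    """
    with conn:  # commit both statements together, or roll back on error
        # Step 1: delete target rows whose key matches an incoming row.
        conn.execute(
            f"DELETE FROM {target} WHERE {key} IN (SELECT {key} FROM {source})"
        )
        # Step 2: insert every incoming row (updated and brand new).
        conn.execute(f"INSERT INTO {target} SELECT * FROM {source}")


# Stand-in engine: in-memory SQLite, used here only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE menu_items (id INTEGER, name TEXT, price REAL)")
conn.execute("CREATE TABLE new_rows (id INTEGER, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO menu_items VALUES (?, ?, ?)",
    [(1, "Cheeseburger", 7.99), (2, "Fries", 2.99)],
)
conn.executemany(
    "INSERT INTO new_rows VALUES (?, ?, ?)",
    [(2, "Fries", 3.49), (3, "Milkshake", 3.99)],
)

delete_then_insert(conn, "menu_items", "new_rows", "id")
rows = conn.execute(
    "SELECT id, name, price FROM menu_items ORDER BY id"
).fetchall()
print(rows)
```

In this sketch the row with `id = 2` is replaced, the row with `id = 3` is added, and the row with `id = 1` is left untouched. Running both statements in one transaction keeps readers from seeing the table between the delete and the insert.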
@@ -392,7 +392,7 @@ TABLE db.menu_items (
 
 ### SCD Type 2 By Column
 
-SCD Type 2 By Column supports sourcing from tables that do not have an "Updated At" timestamp defined in the table.
+SCD Type 2 By Column supports sourcing from tables that do not have an "Updated At" timestamp defined in the table.
 Instead, it will check the columns defined in the `columns` field to see if their value has changed and if so it will record the `valid_from` time as the execution time when the change was detected.
 
 This example specifies a `SCD_TYPE_2_BY_COLUMN` model kind:
@@ -425,7 +425,7 @@ TABLE db.menu_items (
 ```
 
 ### Change Column Names
-SQLMesh will automatically add the `valid_from` and `valid_to` columns to your table.
+SQLMesh will automatically add the `valid_from` and `valid_to` columns to your table.
 If you would like to specify the names of these columns you can do so by adding the following to your model definition:
 ```sql linenums="1" hl_lines="5-6"
 MODEL (
@@ -470,7 +470,7 @@ When a record is added back, the new record will be inserted into the table with
 * SCD_TYPE_2_BY_COLUMN: the `execution_time` when the record was detected again
 
 One way to think about `invalidate_hard_deletes` is that, if enabled, deletes are most accurately tracked in the SCD Type 2 table since it records when the delete occurred.
-As a result though, you can have gaps between records if the there is a gap of time between when it was deleted and added back.
+As a result though, you can have gaps between records if the there is a gap of time between when it was deleted and added back.
 If you would prefer to not have gaps, and a result consider missing records in source as still "valid", then you can set `invalidate_hard_deletes` to `false`.
 
 ### Example of SCD Type 2 By Time in Action
@@ -543,9 +543,9 @@ Target table will be updated with the following data:
 | 4  | Milkshake           | 3.99  | 2020-01-02 00:00:00 | 2020-01-02 00:00:00 | 2020-01-03 00:00:00 |
 | 4  | Chocolate Milkshake | 3.99  | 2020-01-03 00:00:00 | 2020-01-03 00:00:00 | NULL                |
 
-**Note:** `Cheeseburger` was deleted from `2020-01-02 11:00:00` to `2020-01-03 00:00:00` meaning if you queried the table during that time range then you would not see `Cheeseburger` in the menu.
-This is the most accurate representation of the menu based on the source data provided.
-If `Cheeseburger` were added back to the menu with it's original updated at timestamp of `2020-01-01 00:00:00` then the `valid_from` timestamp of the new record would have been `2020-01-02 11:00:00` resulting in no period of time where the item was deleted.
+**Note:** `Cheeseburger` was deleted from `2020-01-02 11:00:00` to `2020-01-03 00:00:00` meaning if you queried the table during that time range then you would not see `Cheeseburger` in the menu.
+This is the most accurate representation of the menu based on the source data provided.
+If `Cheeseburger` were added back to the menu with it's original updated at timestamp of `2020-01-01 00:00:00` then the `valid_from` timestamp of the new record would have been `2020-01-02 11:00:00` resulting in no period of time where the item was deleted.
 Since in this case the updated at timestamp did not change it is likely the item was removed in error and this again most accurately represents the menu based on the source data.
 
 
@@ -621,8 +621,8 @@ Assuming your pipeline ran at `2020-01-03 11:00:00`, Target table will be update
 | 4  | Milkshake           | 3.99  | 2020-01-02 11:00:00 | 2020-01-03 11:00:00 |
 | 4  | Chocolate Milkshake | 3.99  | 2020-01-03 11:00:00 | NULL                |
 
-**Note:** `Cheeseburger` was deleted from `2020-01-02 11:00:00` to `2020-01-03 11:00:00` meaning if you queried the table during that time range then you would not see `Cheeseburger` in the menu.
-This is the most accurate representation of the menu based on the source data provided.
+**Note:** `Cheeseburger` was deleted from `2020-01-02 11:00:00` to `2020-01-03 11:00:00` meaning if you queried the table during that time range then you would not see `Cheeseburger` in the menu.
+This is the most accurate representation of the menu based on the source data provided.
 
 ### Shared Configuration Options
 
@@ -651,7 +651,7 @@ This is the most accurate representation of the menu based on the source data pr
 
 #### Querying the current version of a record
 
-Although SCD Type 2 models support history, it is still very easy to query for just the latest version of a record. Simply query the model as you would any other table.
+Although SCD Type 2 models support history, it is still very easy to query for just the latest version of a record. Simply query the model as you would any other table.
 For example, if you wanted to query the latest version of the `menu_items` table you would simply run:
 
 ```sql linenums="1"
