2 changes: 2 additions & 0 deletions docs/snippets/tables.mdx
@@ -122,6 +122,8 @@ export const PyVersioningTags = "# Create a tag pointing at a specific version\n

export const PyVersioningUpdateData = "table.update(where=\"author='Richard'\", values={\"author\": \"Richard Daniel Sanchez\"})\nrows_after_update = table.count_rows(\"author = 'Richard Daniel Sanchez'\")\nprint(f\"Rows updated to Richard Daniel Sanchez: {rows_after_update}\")\n";

export const TsAddProgress = "// Track ingestion progress as batches are written.\nawait table.add(moreData, {\n progress: (p) => {\n const total = p.totalRows ?? \"?\";\n console.log(\n `wrote ${p.outputRows}/${total} rows ` +\n `(${p.outputBytes} bytes, ${p.elapsedSeconds.toFixed(1)}s, ` +\n `${p.activeTasks}/${p.totalTasks} tasks active)` +\n (p.done ? \" \\u2014 done\" : \"\"),\n );\n },\n});\n";

export const TsAddColumnsCalculated = "// Add a discounted price column (10% discount)\nawait schemaAddTable.addColumns([\n {\n name: \"discounted_price\",\n valueSql: \"cast((price * 0.9) as float)\",\n },\n]);\n";

export const TsAddColumnsDefaultValues = "// Add a stock status column with default value\nawait schemaAddTable.addColumns([\n {\n name: \"in_stock\",\n valueSql: \"cast(true as boolean)\",\n },\n]);\n";
32 changes: 32 additions & 0 deletions docs/tables/create.mdx
@@ -26,6 +26,7 @@ import {
PyCreateTableFromIterator as CreateTableFromIterator,
TsCreateTableFromIterator as TsCreateTableFromIterator,
RsCreateTableFromIterator as RsCreateTableFromIterator,
TsAddProgress,
PyOpenExistingTable as OpenExistingTable,
TsOpenExistingTable as TsOpenExistingTable,
RsOpenExistingTable as RsOpenExistingTable,
@@ -323,6 +324,37 @@ but not yet exposed in Python or TypeScript
([tracking issue](https://github.com/lancedb/lancedb/issues/3173)).
</Note>

#### Tracking ingestion progress
<Badge color="green">TypeScript Only</Badge>

For long-running writes, pass a `progress` callback to `table.add()` to surface
per-batch progress in your UI, logs, or metrics pipeline. The callback fires
once per batch written and once more with `done: true` when the write completes.

Each invocation receives a `WriteProgress` object:

| Field | Description |
|:------|:------------|
| `outputRows` | Rows written so far. |
| `outputBytes` | Bytes written so far. |
| `totalRows` | Expected total rows when the input source reports one. Always set on the final callback. |
| `elapsedSeconds` | Wall-clock seconds since the write started. |
| `activeTasks` | Parallel write tasks currently in flight. |
| `totalTasks` | Total parallel write tasks (the write parallelism). |
| `done` | `true` only for the final callback. |

<CodeGroup>
<CodeBlock filename="TypeScript" language="TypeScript" icon="square-js">
{TsAddProgress}
</CodeBlock>
</CodeGroup>
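
To make the field table concrete, here is a minimal sketch of how the progress object could be typed and turned into a log line. `WriteProgressLike` and `formatProgress` are hypothetical names introduced for illustration; the shape is inferred from the table above and may not match the library's exported `WriteProgress` type exactly.

```typescript
// Hypothetical shape inferred from the field table; the real type may differ.
interface WriteProgressLike {
  outputRows: number;
  outputBytes: number;
  totalRows?: number; // unset for sources that cannot report a total up front
  elapsedSeconds: number;
  activeTasks: number;
  totalTasks: number;
  done: boolean;
}

// Pure formatter: builds the message without doing any I/O,
// so the callback that uses it stays cheap.
function formatProgress(p: WriteProgressLike): string {
  const total = p.totalRows ?? "?";
  // Only show a percentage when a positive total is known.
  const pct =
    p.totalRows !== undefined && p.totalRows > 0
      ? ` (${((p.outputRows / p.totalRows) * 100).toFixed(0)}%)`
      : "";
  const suffix = p.done ? " [done]" : "";
  return (
    `wrote ${p.outputRows}/${total} rows${pct} ` +
    `in ${p.elapsedSeconds.toFixed(1)}s${suffix}`
  );
}
```

Under these assumptions, the callback could be as small as `progress: (p) => console.log(formatProgress(p))`.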

A few things to know before you wire this up:

- Back-pressure: callback invocations are serialized and run inline with each batch, so a slow callback slows the write instead of dropping updates. Every batch update is delivered, and the final `done: true` callback always fires, even on error or cancellation. Keep the callback cheap and offload heavy work to a queue you drain elsewhere.
- Swallowed errors: anything your callback throws is logged with `console.warn` and does not abort the write, so keep the callback side-effect-only and don't rely on it for control flow.
- Row totals: `totalRows` is only populated when the input source can report it up front (for example, a materialized `arrow.Table`). For streaming sources it stays `undefined` until the final callback, where it falls back to the actual rows written.
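
One way to keep the callback cheap, as the first point suggests, is to copy each update into an in-memory queue and process it elsewhere. The sketch below is illustrative only: `ProgressSnapshot` and `ProgressQueue` are hypothetical helpers, not part of the library, and the snapshot carries just a subset of the fields described earlier.

```typescript
// Hypothetical subset of the progress fields; not the library's type.
interface ProgressSnapshot {
  outputRows: number;
  totalRows?: number;
  done: boolean;
}

class ProgressQueue {
  private pending: ProgressSnapshot[] = [];

  // Safe to call inline from the progress callback:
  // it copies the update and returns immediately.
  push(p: ProgressSnapshot): void {
    this.pending.push({ ...p });
  }

  // Called from outside the write path (for example, on a timer).
  // Hands every queued update to `sink` and returns how many were drained.
  drain(sink: (batch: ProgressSnapshot[]) => void): number {
    const batch = this.pending;
    this.pending = [];
    if (batch.length > 0) {
      sink(batch);
    }
    return batch.length;
  }
}
```

With this in place, the callback reduces to `progress: (p) => queue.push(p)`, while something like `setInterval(() => queue.drain(report), 1000)` does the slow work, where `report` is whatever metrics or UI function you supply.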

## Create empty table
You can create an empty table for scenarios where you want to add data to the table later.
An example would be when you want to collect data from a stream/external file and then add it to a table in