lancedb · prrao87 · May 24, 2026 · May 22, 2026 · May 22, 2026 · May 24, 2026
diff --git a/docs/indexing/fts-index.mdx b/docs/indexing/fts-index.mdx
@@ -34,6 +34,8 @@ Check FTS index status using the API:
     </CodeBlock>
 </CodeGroup>
 
+`wait_for_index(...)` waits until the named FTS index exists and `index_stats(...)` reports `num_unindexed_rows == 0`. It can time out if writes keep adding rows faster than the index catches up. If a table has multiple FTS indexes, specify the target text column when querying instead of relying on implicit selection.
+
 ### Asynchronous API
 
 When using async connections (`connect_async`), use `create_index` with the `FTS` configuration:
@@ -48,6 +50,8 @@ When using async connections (`connect_async`), use `create_index` with the `FTS
 The `create_fts_index` method is not available on `AsyncTable`. Use `create_index` with `FTS` config instead.
 </Note>
 
+In TypeScript, create an FTS index with `table.createIndex("text", { config: lancedb.Index.fts() })` and query it with `table.query().nearestToText(...)`.
+
 ## Configuration Options
 
 ### FTS Parameters
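The `wait_for_index(...)` behavior described in this file's added paragraph can be sketched as a plain polling loop. This is a minimal illustration, not LanceDB's implementation: `wait_until_indexed`, `FakeTable`, and the `get_unindexed_rows` callable are hypothetical stand-ins for reading `num_unindexed_rows` from `index_stats(...)`.

```python
import time
from typing import Callable


def wait_until_indexed(
    get_unindexed_rows: Callable[[], int],
    timeout_s: float,
    poll_interval_s: float = 0.01,
) -> bool:
    """Poll until the unindexed-row count is zero or the timeout expires.

    `get_unindexed_rows` is a hypothetical stand-in for reading
    `num_unindexed_rows` from `index_stats(...)`.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_unindexed_rows() == 0:
            return True  # index has caught up
        if time.monotonic() >= deadline:
            return False  # timed out while rows were still unindexed
        time.sleep(poll_interval_s)


class FakeTable:
    """Toy model: each poll, writers add rows and the indexer drains rows."""

    def __init__(self, backlog: int, indexed_per_poll: int, written_per_poll: int):
        self.backlog = backlog
        self.indexed_per_poll = indexed_per_poll
        self.written_per_poll = written_per_poll

    def poll(self) -> int:
        self.backlog = max(
            0, self.backlog + self.written_per_poll - self.indexed_per_poll
        )
        return self.backlog


quiet = FakeTable(backlog=100, indexed_per_poll=50, written_per_poll=0)
busy = FakeTable(backlog=100, indexed_per_poll=50, written_per_poll=60)

quiet_done = wait_until_indexed(quiet.poll, timeout_s=1.0)
busy_done = wait_until_indexed(busy.poll, timeout_s=0.05)
```

In the toy model the quiet table drains its backlog within two polls, while the busy table never reaches zero because writes outpace indexing, so the wait ends at the timeout.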

diff --git a/docs/indexing/gpu-indexing.mdx b/docs/indexing/gpu-indexing.mdx
@@ -26,6 +26,8 @@ into a synchronous process by waiting until the index is built.
 
 </Info>
 
+`wait_for_index(...)` waits for the index to exist and for `index_stats(...)` to report `num_unindexed_rows == 0`. It can time out if the table is receiving continuous writes while the build is trying to catch up.
+
 ## Manual GPU indexing in LanceDB OSS
 
 You can use the Python SDK to manually create the `IVF_PQ` index on a GPU. You'll need
@@ -62,4 +64,3 @@ to enable GPU training on your device.
 
 If you encounter the error `AssertionError: Torch not compiled with CUDA enabled`,
 you need to [install PyTorch with CUDA support](https://pytorch.org/get-started/locally/).
-
diff --git a/docs/indexing/index.mdx b/docs/indexing/index.mdx
@@ -39,6 +39,12 @@ LanceDB provides a comprehensive suite of indexing strategies for different data
 TypeScript currently doesn't support `IvfSq` (IVF with Scalar Quantization).
 </Note>
 
+<Info>
+**Operational checks**
+
+For vector indexes, use the same distance metric when creating the index and searching it. After appends or other writes, use `optimize()` to fold new rows into existing indexes, then check `index_stats(...)` or `wait_for_index(...)` if you need to confirm the index has caught up. `wait_for_index(...)` waits until the named indexes exist and report `num_unindexed_rows == 0`; it can time out if writes keep adding unindexed rows.
+</Info>
+
 ### Quantization Types
 
 Vector indexes can use different quantization methods to compress vectors and improve search performance:

diff --git a/docs/indexing/quantization.mdx b/docs/indexing/quantization.mdx
@@ -15,12 +15,15 @@ Use quantization when:
 
 LanceDB currently exposes multiple quantized vector index types, including:
 - `IVF_PQ` -- Inverted File index with Product Quantization (default). See the [vector indexing guide](/indexing/vector-index) for `IVF_PQ` examples.
+- `IVF_SQ` -- Inverted File index with Scalar Quantization. This is available in Python and Rust; TypeScript does not currently expose `IvfSq`.
 - `IVF_RQ` -- Inverted File index with **RaBitQ** quantization (binary, 1 bit per dimension). Requires vector dimensions divisible by `8`. See [below](#rabitq-quantization) for details.
 - `IVF_HNSW_SQ` -- IVF partitions with an **HNSW graph per partition** plus **Scalar Quantization**. Strong recall/latency/size trade-off for most workloads.
 - `IVF_HNSW_PQ` -- IVF partitions with an **HNSW graph per partition** plus **Product Quantization**. Prefer when PQ-level compression matters and you still want HNSW-style in-partition search.
 
 Two axes are being combined here: whether partitions are searched flatly or via an HNSW graph (`IVF_*` vs. `IVF_HNSW_*`), and which quantizer compresses the vectors (`PQ`, `RQ`, or `SQ`). `IVF_PQ` is the default and works well in many cases. For more drastic compression, RaBitQ (`IVF_RQ`) is a reasonable option. For higher recall at low latency, the HNSW-backed variants are usually the right pick. The ["Choose the Right Index"](/indexing/vector-index#choose-the-right-index) table on the vector indexing page is the canonical decision tool.
 
+Use the same distance metric when training the index and running queries against it. For IVF-based indexes, `num_partitions` controls the number of groups and `sample_rate` controls how many training vectors are sampled per partition, so the training sample is roughly `sample_rate * num_partitions`.
+
 ## RaBitQ quantization
 
 RaBitQ is a binary quantization method that represents each normalized embedding using **1 bit per dimension**, plus a couple of small corrective scalars. In practice, a 1,024-dimensional `float32` vector that would normally take 4 KB can be compressed to roughly a few hundred bytes with RaBitQ, while still maintaining reasonable recall.
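As a worked example of the `sample_rate * num_partitions` estimate added in this hunk, the arithmetic can be sketched as below. The function name and the input values are illustrative only, not LanceDB defaults, and capping the result at the table's row count is an assumption added here because a sample cannot exceed the available rows.

```python
def training_sample_size(
    num_partitions: int, sample_rate: int, total_rows: int
) -> int:
    """Rough number of vectors sampled to train the IVF partitions.

    Capping at `total_rows` is an assumption made for this sketch.
    """
    return min(total_rows, sample_rate * num_partitions)


# Illustrative values only, not LanceDB defaults.
large_table = training_sample_size(
    num_partitions=256, sample_rate=256, total_rows=1_000_000
)
small_table = training_sample_size(
    num_partitions=256, sample_rate=256, total_rows=10_000
)
```

With these inputs the large table trains on 65,536 sampled vectors, while the small table has fewer rows than the estimate and would train on all 10,000.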

diff --git a/docs/indexing/reindexing.mdx b/docs/indexing/reindexing.mdx
@@ -23,7 +23,7 @@ Table optimization performs three maintenance operations:
 
 1. **Compaction**: merges small fragments into larger ones to improve read performance
 2. **Pruning/Cleanup**: removes files from versions older than a retention window (7 days by default)
-3. **Index update**: adds newly-ingested data to existing indexes
+3. **Index update**: adds newly-ingested data to existing vector, scalar, and FTS indexes
 
 <CodeGroup>
     <CodeBlock filename="Python" language="Python" icon="python">
@@ -36,7 +36,7 @@ Table optimization performs three maintenance operations:
 LanceDB Enterprise support incremental reindexing through an automated background process. When new data is added to a table, the system automatically triggers a new index build. As the dataset grows, indexes are asynchronously updated in the background.
 
 - While indexes are being rebuilt, queries use brute force methods on unindexed rows, which may temporarily increase latency. To avoid this, set `fast_search=True` to search only indexed data.
-- Use `index_stats()` to view the number of unindexed rows. This will be zero when indexes are fully up-to-date.
+- Use `index_stats()` to view the number of unindexed rows. This will be zero when indexes are fully up-to-date. If you call `wait_for_index(...)`, it polls the same status and can time out while continuous writes keep adding unindexed rows.
 
 The benefit of using LanceDB Enterprise is that it automates the reindexing process
 and operates continuously in the background, minimizing the impact on latency under high loads.
@@ -57,4 +57,3 @@ If you need to reclaim space more aggressively in OSS, use a shorter retention w
     ```
 </CodeGroup>
 
-
diff --git a/docs/indexing/scalar-index.mdx b/docs/indexing/scalar-index.mdx
@@ -57,6 +57,8 @@ If you are using LanceDB Enterprise, the `create_scalar_index` API returns immed
     </CodeBlock>
 </CodeGroup>
 
+`wait_for_index(...)` waits until the named scalar indexes exist and `index_stats(...)` reports `num_unindexed_rows == 0`. If a table is receiving steady writes, that fully indexed state may not stabilize before the timeout.
+
 ### 3. Update the Index
 
 Updating the table data (adding, deleting, or modifying records) requires that you also update the scalar index. This can be done by calling `optimize`, which will trigger an update to the existing scalar index.
@@ -139,4 +141,3 @@ LanceDB supports scalar indexes on UUID columns (stored as `FixedSizeBinary(16)`
     {ScalarIndexUuidUpsert}
     </CodeBlock>
 </CodeGroup>
-
diff --git a/docs/indexing/vector-index.mdx b/docs/indexing/vector-index.mdx
@@ -53,6 +53,8 @@ You can call `create_index()` with different parameters to create a new index --
 Although the `create_index` API returns immediately, the building of the vector index is asynchronous. To wait until all data is fully indexed, you can specify the `wait_timeout` parameter.
 </Note>
 
+Use the same distance metric for index creation and search. Once a vector index exists, queries use the metric stored with that index. If you need to confirm an async build or refresh is finished, `wait_for_index(...)` waits for the named index to exist and for `index_stats(...)` to report `num_unindexed_rows == 0`; it can time out if new writes keep arriving.
+
 ## Choose the Right Index
 
 Use this table as a quick starting point for choosing the right index type and quantization method for your use case:

diff --git a/docs/reranking/custom-reranker.mdx b/docs/reranking/custom-reranker.mdx
@@ -16,14 +16,18 @@ cover, and only override the ones you need. The base class leaves `rerank_vector
 overridden raises `NotImplementedError` rather than silently returning unsorted results. That's a
 useful guard, but worth knowing about before you wire up a query path you didn't plan for.
 
+The Python base class exposes hybrid, vector-only, and FTS-only rerank hooks. TypeScript and Rust
+currently expose the custom reranker interface for hybrid reranking.
+
 ## Interface
 
 The `Reranker` base interface comes with a `merge_results()` method that can be used to combine the
 results of semantic and full-text search. This is a vanilla merging algorithm that simply concatenates
 the results and removes the duplicates without taking the scores into consideration. It only keeps the
 first copy of the row encountered. This works well in cases that don't require the scores of semantic
 and full-text search to combine the results. If you want to use the scores or want to support
-`return_score="all"`, you'll need to implement your own merging algorithm.
+`return_score="all"`, you'll need to implement your own merging algorithm. The base
+`return_score` option accepts only `"relevance"` and `"all"`.
 
 Whichever methods you override, your reranker has one job on the way out: attach a
 `_relevance_score` column with the most relevant rows at the top. LanceDB will reject the result
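The merge behavior described for `merge_results()` (concatenate, drop duplicates, keep the first copy, ignore scores) can be sketched in plain Python. This is an illustration of the described behavior rather than the library's code: the function name, the `_rowid` key, the list-of-dicts row format, and the vector-first ordering are all assumptions.

```python
from typing import Any


def merge_keep_first(
    vector_rows: list[dict[str, Any]],
    fts_rows: list[dict[str, Any]],
    key: str = "_rowid",
) -> list[dict[str, Any]]:
    """Concatenate both result lists and keep the first copy of each row.

    Scores are ignored entirely; only the first-seen order matters.
    """
    seen: set[Any] = set()
    merged: list[dict[str, Any]] = []
    for row in [*vector_rows, *fts_rows]:
        if row[key] in seen:
            continue  # a copy of this row was already kept
        seen.add(row[key])
        merged.append(row)
    return merged


vector_rows = [
    {"_rowid": 1, "_distance": 0.10},
    {"_rowid": 2, "_distance": 0.25},
]
fts_rows = [
    {"_rowid": 2, "_score": 3.2},
    {"_rowid": 3, "_score": 1.7},
]

merged = merge_keep_first(vector_rows, fts_rows)
merged_ids = [row["_rowid"] for row in merged]
```

Row 2 appears in both inputs, and only its vector-side copy survives, so its FTS score is lost. That is why a score-aware merge, or support for `return_score="all"`, needs a custom implementation.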

diff --git a/docs/reranking/eval.mdx b/docs/reranking/eval.mdx
@@ -26,6 +26,11 @@ score-based path most readers encounter first; `LinearCombinationReranker` is an
 score-based strategy you opt into explicitly.
 </Info>
 
+By default, rerankers return `_relevance_score`. Pass `return_score="all"` when the reranker
+supports it and you also need the original vector or FTS scores for debugging.
+Evaluation code can rely on returned rows being ordered by descending `_relevance_score`. Empty
+reranked result sets still include the `_relevance_score` column.
+
 The hybrid `rerank(...)` method also accepts a `normalize` argument that controls how the raw
 vector and FTS scores are made comparable before reranking:
 

diff --git a/docs/reranking/index.mdx b/docs/reranking/index.mdx
@@ -14,6 +14,9 @@ with models from Cohere, Sentence-Transformers, and more.
 
 To use a reranker, you perform a search and then pass the results to the `rerank()` method.
 
+Note that `CohereReranker()` requires the `cohere` package and either
+`COHERE_API_KEY` in the environment or an `api_key` argument.
+
 <CodeGroup>
 ```python Python icon="python"
 import lancedb
@@ -42,6 +45,17 @@ LanceDB supports several rerankers out of the box. Here are a few examples:
 
 You can find more details about these and other rerankers in the [integrations](/integrations/reranking) section.
 
+Python also includes score-based rerankers such as `RRFReranker`, `LinearCombinationReranker`,
+and `MRRReranker`, plus provider rerankers for OpenAI, Jina, Voyage AI, Answer.AI, and Cohere.
+Provider rerankers usually need the provider package installed and either an API key argument or
+the provider-specific environment variable.
+
+Rerankers add `_relevance_score` and return rows ordered by descending relevance. Python rerankers
+accept `return_score="relevance"` or `return_score="all"`; use `"all"` when you want to keep the
+original vector distance or FTS score columns for debugging. Model-based rerankers read from
+`column="text"` by default, so either return that column in the search results or pass a different
+column.
+
 <Note>
 **SDK coverage differs across languages**
 
@@ -84,4 +98,4 @@ the `deduplicate` flag.
 
 LanceDB also allows you to create custom rerankers by extending the base `Reranker` class. The custom reranker
 should implement the `rerank` method that takes a list of search results and returns a reranked list of
-search results. This is covered in more detail in the [creating custom rerankers](/reranking/custom-reranker/) section.
+search results. This is covered in more detail in the [creating custom rerankers](/reranking/custom-reranker/) section.
diff --git a/docs/search/filtering.mdx b/docs/search/filtering.mdx
@@ -12,6 +12,8 @@ with filtering capabilities even on datasets containing billions of records.
 
 **Pre-filtering** means LanceDB applies the metadata `where(...)` condition before running vector search, so the search only considers rows that already match the filter. **Post-filtering** means LanceDB runs vector search first and only then filters the returned candidates. Pre-filtering is enabled by default. In practice, pre-filtering is better when the filter is part of the result contract; post-filtering can be lower-latency for expensive or non-indexable filters, but it can return fewer than `limit` rows, or even zero, if the nearest neighbors do not pass the filter.
 
+On hybrid queries, the same `where(...)` filter is applied to both the vector and full-text halves of the query. The prefilter or postfilter choice controls whether that happens before each subquery scores candidates or after the subquery top-k is produced.
+
 ## Example: Metadata Filtering
 
 To illustrate filtering capabilities, let's try four data points with combinations of vectors and metadata:
@@ -228,4 +230,4 @@ For a column of type LIST(T), you can use `LABEL_LIST` to create a scalar index.
 
 Both **pre-filtering** and **post-filtering** can yield false positives. For pre-filtering, if the filter is too selective, it might eliminate relevant items that the vector search would have otherwise identified as a good match. In this case, increasing `nprobes` parameter will help reduce such false positives. It is recommended to call `bypass_vector_index()` if you know that the filter is highly selective.
 
-Similarly, a highly selective post-filter can lead to false positives. Increasing both `nprobes` and `refine_factor` can mitigate this issue. When deciding between pre-filtering and post-filtering, pre-filtering is generally the safer choice if you're uncertain.
+Similarly, a highly selective post-filter can lead to false positives. Increasing both `nprobes` and `refine_factor` can mitigate this issue. When deciding between pre-filtering and post-filtering, pre-filtering is generally the safer choice if you're uncertain.
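The difference between pre-filtering and post-filtering described at the top of this file can be sketched with a toy one-dimensional search. This is not LanceDB code: the helper names, the row layout, and the absolute-difference "distance" are assumptions chosen to keep the example small.

```python
from typing import Callable

Row = dict[str, object]


def top_k(rows: list[Row], query: float, k: int) -> list[Row]:
    """Return the k rows nearest to the query (toy 1-D distance)."""
    return sorted(rows, key=lambda r: abs(float(r["x"]) - query))[:k]


def prefilter_search(
    rows: list[Row], query: float, predicate: Callable[[Row], bool], limit: int
) -> list[Row]:
    """Filter first, then search only the rows that match."""
    return top_k([r for r in rows if predicate(r)], query, limit)


def postfilter_search(
    rows: list[Row], query: float, predicate: Callable[[Row], bool], limit: int
) -> list[Row]:
    """Search first, then drop nearest neighbors that fail the filter."""
    return [r for r in top_k(rows, query, limit) if predicate(r)]


rows = [
    {"x": 0.1, "lang": "en"},
    {"x": 0.2, "lang": "en"},
    {"x": 0.3, "lang": "fr"},
    {"x": 0.9, "lang": "fr"},
]


def is_french(row: Row) -> bool:
    return row["lang"] == "fr"


pre = prefilter_search(rows, query=0.0, predicate=is_french, limit=2)
post = postfilter_search(rows, query=0.0, predicate=is_french, limit=2)
```

Here the pre-filtered search returns both French rows, while the post-filtered search returns nothing because the two nearest neighbors are English and are dropped after the top-k is taken.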
diff --git a/docs/search/full-text-search.mdx b/docs/search/full-text-search.mdx
@@ -130,6 +130,8 @@ If you want to specify which columns to search use `fts_columns="text"`
 LanceDB automatically searches on the existing FTS index if the input to the search is of type `str`. If you provide a vector as input, LanceDB will search the ANN index instead.
 </Note>
 
+If a table has more than one FTS index, specify the indexed text column in the query. In Python you can use `fts_columns` or the query builder's `nearest_to_text(..., columns=...)`; in TypeScript, use `query().nearestToText(..., columns)`. The newer Lance-native FTS does not accept legacy Tantivy-only index parameters.
+
 ### Keeping the index up to date
 
 Rows you add after building an FTS index aren't part of the index until you optimize the table. Until then, queries fall back to a flat scan over the unindexed fragments to keep results complete, which slows them down as the unindexed tail grows. Call `table.optimize()` to fold new rows into the existing index — it's the same operation used for vector indexes:

diff --git a/docs/search/hybrid-search.mdx b/docs/search/hybrid-search.mdx
@@ -244,6 +244,10 @@ text_query = "flower moon"
 
 Hybrid queries inherit the same builder API as vector and FTS queries, so the same knobs for filtering, distance bounds, and row identity apply. These compose with `.rerank(...)` and the explicit `.vector()` / `.text()` form shown above.
 
+<Info>
+Always set `.limit(...)` on production hybrid queries. Without an explicit cap, there is no fixed top-k to tune against, and the query may materialize more rows than you intended before reranking.
+</Info>
+
 ### Returning row IDs
 
 Pass `with_row_id(True)` (Python) or `withRowId()` (TypeScript) to include the internal `_rowid` column in the results. This is useful for joining hybrid results back to a primary table, or for deduping across multiple queries:

diff --git a/docs/search/index.mdx b/docs/search/index.mdx
@@ -13,3 +13,10 @@ icon: "list"
 | [Hybrid Search](/search/hybrid-search/) | Combines vector and full-text search with reranking |
 | [Filtering](/search/filtering/) | Filter results based on metadata fields |
 | [SQL Queries](/search/sql/index) | SQL query capabilities for data exploration and analytics |
+
+## Before you search
+
+- Vector search can run without an ANN index as an exhaustive scan. That's useful while prototyping, but build a vector index before relying on low-latency searches over larger tables.
+- Full-text and hybrid text search require an FTS index on the text column you query. If a table has multiple FTS indexes, specify the target column. FTS also supports phrase, boolean, boosted, multi-match, and fuzzy query forms when you need more than plain terms.
+- Multivector search currently uses cosine similarity and accepts either one query vector or a matrix of query vectors; every query vector must match the inner dimension of the multivector column.
+- Set an explicit `.limit(...)` for production queries. Query builders also support controls such as prefilter/postfilter, distance ranges, row-id inclusion, offset pagination, and Arrow/Pandas/list result materialization.
diff --git a/docs/search/multivector-search.mdx b/docs/search/multivector-search.mdx
@@ -19,6 +19,8 @@ Each item in your dataset can have a column containing multiple vectors, which L
 Currently, only the `cosine` metric is supported for multivector search. The vector value type can be `float16`, `float32`, or `float64`.
 </Warning>
 
+Each query vector must match the inner vector dimension in the multivector column. This applies to both single-vector queries and multi-vector query matrices.
+
 ## Computing Similarity
 
 MaxSim (Maximum Similarity) is a key concept in late-interaction models that:

diff --git a/docs/search/optimize-queries.mdx b/docs/search/optimize-queries.mdx
@@ -33,6 +33,11 @@ Executes the query and provides detailed runtime metrics, including:
 
 Together, these tools offer a comprehensive view of query performance, from planning to execution. Use `explain_plan` to verify your query structure and `analyze_plan` to measure and optimize actual performance.
 
+Metadata filters are prefiltered by default, which usually shows the filter pushed into the
+`LanceScan` or index scan. If you set `prefilter=False`, expect a separate `FilterExec` after
+search instead; that can be useful for some expensive filters, but it changes both latency and
+the number of rows available after filtering.
+
 ## Reading the Execution Plan
 
 To demonstrate query performance analysis, we'll use a table containing 1.2M rows sampled from the [Wikipedia dataset](https://huggingface.co/datasets/wikimedia/wikipedia). Initially, the table has no indices, allowing us to observe the impact of optimization.

diff --git a/docs/search/sql/fts-sql.mdx b/docs/search/sql/fts-sql.mdx
@@ -15,6 +15,8 @@ thoroughly and being prepared to update your queries as newer versions of LanceD
 
 LanceDB provides support for full-text search via SQL queries using the `fts()` User-Defined Table Function (UDTF). This allows you to incorporate keyword-based search (based on BM25) in your SQL queries for powerful text retrieval.
 
+The SQL `fts()` table function expects exactly two string literals: the table name and the JSON FTS query. Build the JSON query in your application, pass it as a SQL string literal, and keep filtering, grouping, or joining in the surrounding SQL.
+
 ## Table Setup
 
 First, set up your FlightSQL client connection. See [SQL Queries documentation](/search/sql) for detailed client setup instructions.
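The "build the JSON in your application, pass it as a SQL string literal" step added in this hunk can be sketched as follows. The helper names are hypothetical, the JSON query shape is a placeholder rather than the documented FTS query schema, and doubling single quotes assumes standard SQL string-literal escaping.

```python
import json


def sql_string_literal(value: str) -> str:
    """Quote a value as a SQL string literal (assumes standard '' escaping)."""
    return "'" + value.replace("'", "''") + "'"


def build_fts_sql(table_name: str, fts_query: dict) -> str:
    """Build a SELECT over fts(<table name literal>, <JSON query literal>)."""
    query_json = json.dumps(fts_query)
    return (
        "SELECT * FROM fts("
        f"{sql_string_literal(table_name)}, {sql_string_literal(query_json)})"
    )


# Placeholder query shape; consult the FTS SQL docs for the real schema.
example_query = {"match": {"column": "text", "terms": "puppy's toy"}}
sql = build_fts_sql("my_table", example_query)
```

Filtering, grouping, or joins would then wrap this statement in ordinary SQL rather than being encoded in the JSON argument.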