Secondary index doc update #202

Open
ghost wants to merge 6 commits into master from Seconday-index-update

@ghost ghost commented Jan 20, 2026

  1. Secondary update concept
  2. Secondary update how-to

@ghost ghost self-assigned this Jan 20, 2026

coderabbitai bot commented Jan 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Replaces the previous secondary-index how-to with two new documentation files: a conceptual overview and a practical how-to. All edits are documentation-only; no code, schema, or public API changes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Secondary Index Concepts: `docs/defradb/Concepts/secondary-index.md` | Added a new conceptual document describing secondary index types, internal structures (including inverted/path-aware indexes for JSON), query execution patterns with/without indexes, performance considerations, maintenance overhead, ordering/direction, benefits, limitations, and usage guidance. |
| Secondary Index How-To Guide: `docs/defradb/How-to Guides/secondary-index-how-to.md` | Added a new practical how-to covering the @index directive, directive options (name, unique, direction, includes), composite & relationship indexes, JSON field indexing, CLI lifecycle (create/drop/list), examples, troubleshooting, and optimization tips. |
| Removed How-To Documentation: `docs/defradb/How-to Guides/secondary-index.md` | Deleted the previous how-to doc that contained @index examples, configuration options, JSON indexing mechanics, and best-practice guidance (content migrated into the two new files). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • jsimnz

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Secondary index doc update' directly describes the main change: reorganizing and updating secondary index documentation (moving content from How-to Guides to Concepts, removing duplicates, adding a new how-to guide). |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@docs/defradb/How-to Guides/seconday-index-how-to.md`:
- Around line 1-7: Rename the file named "seconday-index-how-to.md" to
"secondary-index-how-to.md" (fix the missing 'r'), and update any internal
references or links that point to "seconday-index-how-to.md" so they use
"secondary-index-how-to.md" instead; ensure the frontmatter and sidebar_label
remain unchanged and verify that any sidebar or table-of-contents entries
referencing the old filename are updated to the new filename.
- Line 25: The markdown link label "[secondary index concepts]" points to the
wrong path; update the link target from
"/defradb/next/Concepts/secondary-indexes" to the correct file name
"/defradb/next/Concepts/secondary-index.md" so the link resolves without a 404.
- Line 266: Update the link in the "Test query performance" bullet that
currently points to "/defradb/next/How-to%20Guides/explain-systems-how-to" so it
uses the same relative pattern as other How-to guides; replace that absolute
path with "./explain-systems-how-to.md" in the line containing "**Test query
performance**" to match existing guide linking conventions.
🧹 Nitpick comments (2)
docs/defradb/Concepts/secondary-index.md (2)

35-46: Consider adding language specifiers to code blocks for better rendering.

The conceptual examples showing query execution would benefit from language specifiers (e.g., text or plaintext) to improve syntax highlighting and rendering consistency.

💡 Suggested formatting improvement
-```
+```text
 Query: Find users with age = 30
 Process: Scan all user documents → Check each age field → Return matches
 Cost: O(n) where n = total documents

-```
+```text
Query: Find users with age = 30
Process: Look up "30" in age index → Return matching document IDs
Cost: O(log n) for lookup + O(m) for retrieval where m = matching documents


55-60: Add language specifiers to index structure examples.

The code blocks showing index structures and formats lack language specifiers. Adding text or plaintext identifiers would improve rendering and satisfy linting rules.

💡 Example fix for one block
-```
+```text
 Index entries:
 "Alice" → [doc_id_1]
 "Bob" → [doc_id_2, doc_id_3]
 "Charlie" → [doc_id_4]

Apply similar changes to blocks at lines 94-98, 200-204, 213-222, and 272-280.

<details>
<summary>📜 Review details</summary>

**Configuration used**: Organization UI

**Review profile**: CHILL

**Plan**: Pro

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 2b21cce44e2ae40d774891e99079c779873d6e3d and c5cd21b8d82847b8188587f2f25cb7306726d188.

</details>

<details>
<summary>📒 Files selected for processing (3)</summary>

* `docs/defradb/Concepts/secondary-index.md`
* `docs/defradb/How-to Guides/secondary-index.md`
* `docs/defradb/How-to Guides/seconday-index-how-to.md`

</details>

<details>
<summary>💤 Files with no reviewable changes (1)</summary>

* docs/defradb/How-to Guides/secondary-index.md

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>🪛 markdownlint-cli2 (0.18.1)</summary>

<details>
<summary>docs/defradb/Concepts/secondary-index.md</summary>

12-12: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

---

35-35: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

42-42: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

55-55: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

94-94: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

200-200: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

213-213: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

219-219: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

272-272: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

277-277: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

</details>

<sub>✏️ Tip: You can disable this entire section by setting `review_details` to `false` in your review settings.</sub>

</details>


- DefraDB installed and running
- A defined schema for your collections
- Understanding of [secondary index concepts](/defradb/next/Concepts/secondary-indexes)

⚠️ Potential issue | 🔴 Critical

Critical: Broken link to concepts page.

The link path uses secondary-indexes (plural) but the actual file is secondary-index.md (singular). This will result in a 404 error.

🔗 Proposed fix
-- Understanding of [secondary index concepts](/defradb/next/Concepts/secondary-indexes)
+- Understanding of [secondary index concepts](/defradb/next/Concepts/secondary-index)


cloudflare-workers-and-pages bot commented Jan 20, 2026

Deploying docs-source-network with Cloudflare Pages

Latest commit: 07faa41
Status: ✅  Deploy successful!
Preview URL: https://ee6b35c8.docs-source-network.pages.dev
Branch Preview URL: https://seconday-index-update.docs-source-network.pages.dev

@islamaliev islamaliev left a comment

Overall, the documentation gives valuable information, but it looks mostly AI-generated and some of the information is incorrect.

Also, I don't see any mention of arrays with indexes applied to them.

Also, I feel like we can talk more about indexes on relations. Btw, 1-to-1 2-sided relations (in the example we have only a 1-sided relation) are automatically constrained by a unique index to enforce the 1-to-1 invariant.

Also, I think we can add a warning about indexing arrays and JSON, as this might fill up the storage pretty quickly with large data.

I didn't read the "How-to Guide". First we need to adjust this one.

```
Query: Find users with age = 30
Process: Look up "30" in age index → Return matching document IDs
Cost: O(log n) for lookup + O(m) for retrieval where m = matching documents

todo: the cost of a lookup (e.g. with the `_eq` matcher) is not O(log n); it's O(1)

Comment on lines +50 to +52
DefraDB stores indexes as sorted key-value pairs where:
- **Key**: The indexed field value(s)
- **Value**: Document identifier (_key)

todo: this is true only for unique indexes:
/col_id/ind_id/field_values -> _docID

For regular indexes it's
/col_id/ind_id/field_values/_docID -> {}
with the key pointing to an empty value

"Charlie" → [doc_id_4]
```

When you query for `name = "Bob"`, DefraDB looks up "Bob" in the index and immediately retrieves `doc_id_2` and `doc_id_3`.

todo: it doesn't retrieve both at once. Depending on the query it may retrieve only one of them, for example with `limit: 1`.
It will fetch them one by one.
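
The regular-index key layout quoted earlier in this review and the one-by-one fetching described here can be sketched as follows. This is a minimal illustration, not DefraDB's code: the helper names `regular_key` and `scan` are hypothetical, and a sorted Python list stands in for the key-value store.

```python
import bisect
from itertools import islice

def regular_key(col_id, ind_id, value, doc_id):
    # Regular (non-unique) layout from the review:
    # /col_id/ind_id/field_values/_docID -> {} (empty value)
    return f"/{col_id}/{ind_id}/{value}/{doc_id}"

# Assumption: a sorted list of keys stands in for the ordered key-value store.
keys = sorted([
    regular_key(1, 1, "Alice", "doc_id_1"),
    regular_key(1, 1, "Bob", "doc_id_2"),
    regular_key(1, 1, "Bob", "doc_id_3"),
    regular_key(1, 1, "Charlie", "doc_id_4"),
])

def scan(prefix):
    """Lazily yield the doc ID of each key under `prefix`, one key at a time."""
    i = bisect.bisect_left(keys, prefix)  # seek to the first candidate key
    while i < len(keys) and keys[i].startswith(prefix):
        yield keys[i].rsplit("/", 1)[1]   # the doc ID is the last key segment
        i += 1

all_bobs = list(scan("/1/1/Bob/"))
first_bob = list(islice(scan("/1/1/Bob/"), 1))  # behaves like limit: 1
```

Because `scan` is a generator, the `limit: 1` case stops after the first key and never touches the entry for `doc_id_3`.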

```
("published", "2024-01-15") → [doc_id_1]
("published", "2024-01-16") → [doc_id_2, doc_id_3]
("draft", "2024-01-15") → [doc_id_4]

todo: this is incorrect. The entries look like this:

published/2024-01-15/doc_id_1 → {}
published/2024-01-16/doc_id_2 → {}
published/2024-01-16/doc_id_3 → {}
draft/2024-01-15/doc_id_4 → {}

Here I'm omitting the col_id and index_id prefixes, which are always present.

@ghost ghost force-pushed the Seconday-index-update branch from c5cd21b to 899b3eb Compare February 10, 2026 18:29
@ghost ghost force-pushed the Seconday-index-update branch from 899b3eb to 8a2549b Compare February 17, 2026 10:19

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5cd21b and e250c94.

📒 Files selected for processing (3)
  • docs/defradb/Concepts/secondary-index.md
  • docs/defradb/How-to Guides/secondary-index-how-to.md
  • docs/defradb/How-to Guides/secondary-index.md
💤 Files with no reviewable changes (1)
  • docs/defradb/How-to Guides/secondary-index.md
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{md,mdx}

📄 CodeRabbit inference engine (CLAUDE.md)

Documentation files support both .md (CommonMark) and .mdx (MDX) formats

Files:

  • docs/defradb/How-to Guides/secondary-index-how-to.md
  • docs/defradb/Concepts/secondary-index.md
🪛 markdownlint-cli2 (0.21.0)
docs/defradb/Concepts/secondary-index.md

[warning] 35-35: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 42-42: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 55-55: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 94-94: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 200-200: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 213-213: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 219-219: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 272-272: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 277-277: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/defradb/Concepts/secondary-index.md`:
- Around line 35-39: The fenced code blocks in the Concept: secondary-index doc
lack language identifiers (markdownlint MD040); update each shown
triple-backtick block by adding an appropriate language tag (e.g., ```text or
```json) so they become fenced with a language: for example the block starting
with "Query: Find users with age = 30", the block starting "Process: Look up
\"30\" in age index → Return matching document IDs", and the other examples such
as the blocks beginning with "Index entries:", the tuple examples like
("published", "2024-01-15") → [doc_id_1], the nested path arrays like
["user","device","model","iPhone"], the doc_id mapping blocks, and the
collection/index path examples — prepend each opening ``` with a language token
(suggestion: text for plain examples, json for JSON-like snippets).

---

Duplicate comments:
In `@docs/defradb/How-to Guides/secondary-index-how-to.md`:
- Line 266: Replace the absolute link in the "Test query performance" bullet
("Use the [explain
systems](/defradb/next/How-to%20Guides/explain-systems-how-to) to analyze query
execution") with a relative link to the local How‑to page (e.g., change to "Use
the [explain systems](./explain-systems-how-to.md) to analyze query execution")
so it matches other How‑to guides' relative link format; update the link target
string in that exact bullet text.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e250c94 and 011596b.

📒 Files selected for processing (2)
  • docs/defradb/Concepts/secondary-index.md
  • docs/defradb/How-to Guides/secondary-index-how-to.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/defradb/Concepts/secondary-index.md
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{md,mdx}

📄 CodeRabbit inference engine (CLAUDE.md)

Documentation files support both .md (CommonMark) and .mdx (MDX) formats

Files:

  • docs/defradb/How-to Guides/secondary-index-how-to.md
🪛 LanguageTool
docs/defradb/How-to Guides/secondary-index-how-to.md

[style] ~15-~15: This phrase might be redundant. Consider either removing or replacing the adjective ‘additional’.
Context: ... and use unique indexes sparingly (they add an additional read operation on every write). Plan indexes...

(ADD_AN_ADDITIONAL)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/defradb/How-to Guides/secondary-index-how-to.md`:
- Line 15: Replace the phrase "an additional read operation" in the sentence
that warns about unique indexes with a tighter wording such as "an extra read
operation" or simply "a read operation"; locate the sentence containing "unique
indexes sparingly (they add an additional read operation on every write)" and
update that parenthetical to use the preferred phrasing.

---

Duplicate comments:
In `@docs/defradb/How-to Guides/secondary-index-how-to.md`:
- Line 286: In the sentence that links to the explain systems guide, replace the
absolute URL "/defradb/next/How-to%20Guides/explain-systems-how-to" with a
relative link matching other How-to guides (e.g., ./explain-systems-how-to.md);
update the link text near "explain systems" in secondary-index-how-to.md so it
uses the relative path and the same markdown link style as sibling files.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 011596b and 3367dc0.

📒 Files selected for processing (1)
  • docs/defradb/Concepts/secondary-index.md
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{md,mdx}

📄 CodeRabbit inference engine (CLAUDE.md)

Documentation files support both .md (CommonMark) and .mdx (MDX) formats

Files:

  • docs/defradb/Concepts/secondary-index.md
🪛 markdownlint-cli2 (0.21.0)
docs/defradb/Concepts/secondary-index.md

[warning] 37-37: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 45-45: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 55-55: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 61-61: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 67-67: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 108-108: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 228-228: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 242-242: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 248-248: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 291-291: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 297-297: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/defradb/Concepts/secondary-index.md`:
- Line 320: The single-field example showing publishedAt: DateTime
`@index`(direction: DESC) is incorrectly placed under the "composite index"
direction subsection; replace that single-field snippet with a composite-index
example that matches the heading — use an `@index`(includes: [...]) example
showing per-field directions (e.g., {field: "status", direction: ASC}, {field:
"publishedAt", direction: DESC}) and move or re-add the single-field publishedAt
example to the earlier single-field discussion if you want to keep it as an
illustration; ensure the composite `@index`(includes: ...) snippet replaces the
current single-field code block so the prose and example align.
- Line 198: Replace the non-standard phrase "1-to-2-sided relations" with a
clearer term such as "one-to-one two-sided (bidirectional) relations" and update
the note so it reads along the lines of: "Note: one-to-one two-sided
(bidirectional) relations are automatically constrained by a unique index to
enforce the one-to-one invariant." Ensure the replacement appears wherever the
original phrase appears in the Concepts/secondary-index.md content and preserve
the intent about automatic unique-index enforcement.
- Line 45: Update the documented index lookup complexity in
docs/defradb/Concepts/secondary-index.md: replace the incorrect "O(1)"
complexity with "O(log n)" in the two locations called out (the text around line
48 and the paragraph around line 308) and mention that this reflects BadgerDB's
LSM-tree behavior (memtable + multiple SSTable levels) so point lookups are
O(log n) in the worst case.

---

Duplicate comments:
In `@docs/defradb/Concepts/secondary-index.md`:
- Around line 37-41: The fenced code blocks in
docs/defradb/Concepts/secondary-index.md are missing language identifiers
(MD040); update each bare triple-backtick fence to include a language tag (use
"text") for the examples such as the "Query: Find users with age = 30" block,
the index lookup examples (e.g. lines containing
"/col_id/ind_id/field_values/_docID → {}",
"/1/1/user.device.model/iPhone/doc_id_1 → {}", and the
"<collection_id>/<index_id>/<json_path>/<json_value>/<doc_id>" sample), and
apply the same change to all other reported ranges (45-49, 55-57, 61-63, 67-73,
108-113, 228-232, 242-244, 248-251, 291-293, 297-300) so every ``` becomes
```text.


**With an index on age:**

```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n docs/defradb/Concepts/secondary-index.md | sed -n '40,55p'

Repository: sourcenetwork/docs.source.network

Length of output: 598


🏁 Script executed:

cat -n docs/defradb/Concepts/secondary-index.md | sed -n '300,315p'

Repository: sourcenetwork/docs.source.network

Length of output: 1059


🏁 Script executed:

wc -l docs/defradb/Concepts/secondary-index.md

Repository: sourcenetwork/docs.source.network

Length of output: 121


🏁 Script executed:

cd docs && find . -type f \( -name "*.md" -o -name "*.mdx" \) -exec grep -l "BadgerDB\|storage engine\|LSM" {} \;

Repository: sourcenetwork/docs.source.network

Length of output: 173


🏁 Script executed:

# Search for DefraDB architecture documentation
find docs -type f \( -name "*.md" -o -name "*.mdx" \) | xargs grep -i "backend\|storage" | head -20

Repository: sourcenetwork/docs.source.network

Length of output: 5221


🏁 Script executed:

# Look for any mentions of indexing complexity or performance
cat -n docs/defradb/Concepts/secondary-index.md | grep -i "o(1)\|o(log\|o(n)\|complexity\|time"

Repository: sourcenetwork/docs.source.network

Length of output: 752


🏁 Script executed:

# Search for BadgerDB mentions
find docs -type f \( -name "*.md" -o -name "*.mdx" \) | xargs grep -i "badger" 2>/dev/null

Repository: sourcenetwork/docs.source.network

Length of output: 664


🏁 Script executed:

# Check deployment.md for backend info
cat -n docs/defradb/How-to\ Guides/deployment.md | head -50

Repository: sourcenetwork/docs.source.network

Length of output: 3319


🏁 Script executed:

# Search for DefraDB core documentation about storage
find docs -type f -name "*.md" | xargs grep -i "storage engine\|database backend\|merkle" 2>/dev/null | head -15

Repository: sourcenetwork/docs.source.network

Length of output: 3389


🏁 Script executed:

# Look at the main DefraDB documentation structure
ls -la docs/defradb/

Repository: sourcenetwork/docs.source.network

Length of output: 572


🏁 Script executed:

cat -n docs/defradb/Concepts/secondary-index.md | sed -n '50,100p'

Repository: sourcenetwork/docs.source.network

Length of output: 1694


🏁 Script executed:

# Check if there's more context about index implementation
cat -n docs/defradb/Concepts/secondary-index.md | sed -n '1,60p'

Repository: sourcenetwork/docs.source.network

Length of output: 2792


🌐 Web query:

BadgerDB LSM tree lookup complexity point query performance

💡 Result:

Web search failed: Server error: no LLM provider could handle the message


Update index lookup complexity from O(1) to O(log n) to reflect BadgerDB's LSM-tree implementation.

DefraDB uses BadgerDB (an LSM-tree-based key-value store) as its default storage backend. Point lookups in LSM trees have worst-case complexity of O(log n) due to checking the memtable and potentially multiple SSTable levels, not O(1). Lines 48 and 308 should reflect this accurate complexity.
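
The logarithmic cost claimed here for a seek over sorted keys can be illustrated with a plain binary search. This is a minimal sketch under assumptions: the `seek` helper and the zero-padded key format are hypothetical, and a sorted Python list stands in for the store. It does not model BadgerDB's memtable, SSTable levels, or caching, so it does not settle the O(1) versus O(log n) question raised in this thread.

```python
def seek(sorted_keys, target):
    """Return (index, comparisons) for the first key >= target."""
    lo, hi, comparisons = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_keys[mid] < target:
            lo = mid + 1   # target lies in the upper half
        else:
            hi = mid       # target lies in the lower half (or at mid)
    return lo, comparisons

# Assumption: ages are zero-padded so lexicographic order equals numeric order.
keys = [f"/1/1/{age:03d}/doc_{age}" for age in range(1000)]

index, comparisons = seek(keys, "/1/1/030/")
```

With 1000 keys the loop halves the search range on every step, so the number of comparisons stays at or below 10 no matter which value is looked up.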

Changes needed

Line 48:

-Cost: O(1) for lookup + O(m) for retrieval where m = matching documents
+Cost: O(log n) for lookup + O(m) for retrieval where m = matching documents

Line 308:

-On reads, an `_eq` filter on an indexed field is O(1) for the lookup, plus O(m) to retrieve the m matching documents.
+On reads, an `_eq` filter on an indexed field is O(log n) for the lookup, plus O(m) to retrieve the m matching documents.
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 45-45: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Without the unique constraint, the relationship defaults to one-to-many (multiple Users could reference the same Address). The unique index ensures exactly one User per Address.

Note: 1-to-2-sided relations are automatically constrained by a unique index to enforce the 1-to-1 invariant.

⚠️ Potential issue | 🟡 Minor

Clarify the non-standard "1-to-2-sided" terminology.

"1-to-2-sided relations" is not a recognised cardinality term and is likely confusing to readers. Based on context (enforcing a 1-to-1 invariant), this probably means "one-to-one two-sided (bidirectional)" relations.

🔧 Proposed fix
-Note: 1-to-2-sided relations are automatically constrained by a unique index to enforce the 1-to-1 invariant.
+Note: One-to-one (bidirectional) relations are automatically constrained by a unique index to enforce the 1-to-1 invariant.


## Direction and ordering

Index direction (ASC or DESC) plays a significant role primarily for **composite indexes**. For single-field indexes, the index fetcher can traverse entries in reverse order just as efficiently as the default order, so direction has minimal practical impact there.
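
The claim in this paragraph can be sketched with sorted tuples. This is an illustration under assumptions, not DefraDB's index fetcher: the rows reuse the example values from this thread, and Python lists stand in for index entries.

```python
# Hypothetical rows: (status, publishedAt, doc_id).
rows = [
    ("published", "2024-01-15", "doc_id_1"),
    ("published", "2024-01-16", "doc_id_2"),
    ("draft", "2024-01-15", "doc_id_4"),
]

# Single-field index on publishedAt, stored ascending.
single_asc = sorted((date, doc) for _, date, doc in rows)
# Walking it backwards yields descending order with no extra work.
single_desc = list(reversed(single_asc))

# Composite index stored as (status ASC, publishedAt ASC).
composite_asc = sorted(rows)
composite_reversed = list(reversed(composite_asc))  # (status DESC, publishedAt DESC)

# Desired order: status ASC, publishedAt DESC.
# Two stable sorts: secondary key first, then primary key.
wanted = sorted(
    sorted(rows, key=lambda r: r[1], reverse=True),
    key=lambda r: r[0],
)

# Neither traversal direction of the all-ascending composite index matches.
mixed_needs_own_index = wanted != composite_asc and wanted != composite_reversed
```

A forward or backward walk can only flip every field at once, which is why mixed directions have to be declared per field on the composite index itself.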

⚠️ Potential issue | 🟡 Minor

Single-field direction example is misplaced under the "composite index" direction sub-section.

Line 322 says "For composite indexes, specifying direction can matter:" but the code block that immediately follows (Lines 324–328) demonstrates a single-field index (publishedAt: DateTime @index(direction: DESC)). The composite example doesn't appear until Lines 332–337. This contradicts the preceding prose (Line 320) which explicitly states direction has minimal impact on single-field indexes.

Either move the single-field snippet to illustrate the single-field case earlier, or replace it with a composite index example that matches the section heading.

🔧 Proposed fix — replace misplaced single-field example with the composite one
 For composite indexes, specifying direction can matter:
 
-```graphql
-type Article {
-  publishedAt: DateTime `@index`(direction: DESC)
-}
-```
-
-Each field in a composite index can have its own direction:
-
 ```graphql
 `@index`(includes: [
   {field: "status", direction: ASC},
   {field: "publishedAt", direction: DESC}
 ])
Also applies to: 322-322, 324-324, 328-328


/1/1/user.device.model/Android/doc_id_2 → {}
```

When you query for a specific path and value, DefraDB directly looks it up in the inverted index and retrieves all matching documents. For more on inverted indexes, see the [CockroachDB RFC on inverted indexes](https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20171020_inverted_indexes.md).
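
The path-and-value entries quoted above can be derived from a JSON document roughly as follows. This is a minimal sketch, assuming the `/col_id/ind_id/json_path/json_value/doc_id` layout shown in this thread; the helpers `flatten` and `index_keys` are hypothetical, arrays are not handled, and DefraDB's real key encoding may differ.

```python
def flatten(value, path=()):
    """Yield (dotted_path, leaf_value) for every leaf of a nested dict."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, path + (key,))
    else:
        yield ".".join(path), value

def index_keys(col_id, ind_id, doc_id, doc):
    # One index entry per leaf; each key would map to an empty value.
    return [
        f"/{col_id}/{ind_id}/{json_path}/{leaf}/{doc_id}"
        for json_path, leaf in flatten(doc)
    ]

doc = {"user": {"device": {"model": "Android"}, "name": "Bob"}}
keys = index_keys(1, 1, "doc_id_2", doc)
```

Every leaf produces its own entry, so a large or deeply nested document yields many keys, which is the storage concern raised earlier in this review.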

I'm not sure if we should reference CockroachDB in our documentation. @jsimnz what do you think?

}
```

This ensures no two Users can reference the same Address document. Note that 1-to-2-sided relations are automatically constrained by a unique index to enforce the 1-to-1 invariant.

todo: I'm not familiar with the term "1-to-2-sided relations". There are "1-to-1" and "1-to-many".
