Add blog post on Blob Direct Write and partitioned blob files #14873

Open
xingbowang wants to merge 2 commits into facebook:main from xingbowang:2026_06_20_bdw_blog

@xingbowang

Contributor

Summary

  • Add a RocksDB blog post explaining Blob Direct Write and partitioned blob files.
  • Cover the write-path transformation to BlobIndex, partition selection, blob-file lifecycle, read fallback for in-flight direct-write files, and the current v1 scope.
  • Describe policy-driven grouping use cases such as TTL bucketing with Universal Compaction and wide-column metadata.

Test Plan

  • git diff --check upstream/main...2026_06_20_bdw_blog

@meta-cla meta-cla Bot added the CLA Signed label Jun 21, 2026
@github-actions

✅ clang-tidy: No findings on changed lines

Completed in 0.0s.

@github-actions

github-actions Bot commented Jun 21, 2026

Codex Code Review - OBSOLETE

Superseded by a newer AI review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit b6f9299


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions

github-actions Bot commented Jun 21, 2026

Claude Code Review - OBSOLETE

Superseded by a newer AI review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit b6f9299


Summary

Blog post is well-written, technically accurate, and the code examples match the actual codebase. The API signatures, class names, option names, and lifecycle methods all exist and match their implementations.

High-severity findings (1):

  • [authors.yml] Author xbw is not defined in docs/_data/authors.yml. The blog post will not display the author name correctly. Per CLAUDE.md: "Blog post authors must be defined in docs/_data/authors.yml to be displayed."

Findings

🔴 HIGH

H1. Missing author definition -- docs/_data/authors.yml
  • Issue: The blog post front matter specifies author: xbw, but xbw is not defined in docs/_data/authors.yml.
  • Root cause: Author entry was not added alongside the blog post.
  • Suggested fix: Add an entry to docs/_data/authors.yml:
    xbw:
      full_name: Xingbo Wang
    (Adjust full_name to the author's actual name.)

🟡 MEDIUM

No medium-severity findings.

🟢 LOW / NIT

L1. No #include shown in code examples
  • Issue: The code examples don't show the necessary #include directives. Omitting them is standard for blog posts, but including them could help readers who try the examples directly.

Cross-Component Analysis

Documentation-only change. No code modified. Only impact is author rendering in Jekyll (blocked by H1).

Positive Observations

  • Technically thorough and well-structured, covering write path, read path, lifecycle, partitioning, wide-column interaction, and v1 scope limitations.
  • All code examples verified accurate against actual API signatures in include/rocksdb/blob_file_partition_strategy.h.
  • All referenced option names (enable_blob_direct_write, blob_direct_write_partitions, blob_direct_write_partition_strategy) exist in include/rocksdb/advanced_options.h.
  • All referenced internal classes and methods (BlobWriteBatchTransformer, BlobFilePartitionManager, RotateCurrentGeneration, PrepareFlushAdditions, ResolveBlobDirectWriteIndex) exist in the codebase.
  • The "Current Scope" section is commendably transparent about v1 limitations.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

@xingbowang xingbowang force-pushed the 2026_06_20_bdw_blog branch from b6f9299 to cc398c8 Compare June 23, 2026 16:11
@github-actions

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit cc398c8


Codex review failed before producing findings. The CLI reported the same error as on commit b6f9299: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'.

@github-actions

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit cc398c8


Summary

This PR adds a well-written blog post explaining the Blob Direct Write feature and partitioned blob files. The technical claims align with the codebase. The author (xbw) is properly defined in authors.yml. No non-ASCII characters were found.

High-severity findings (0):

No high-severity findings.

Findings

🔴 HIGH

None.

🟡 MEDIUM

M1. Code example sets blob_direct_write_partitions = 8 but default is 1 — could confuse readers
  • Issue: The code example in the "Partitioned Blob Files" section sets options.blob_direct_write_partitions = 8; without comment. While valid, the default value in advanced_options.h:1239 is 1. A reader might think 8 is a recommended or default value.
  • Suggested fix: Add a brief inline comment like // default is 1 or mention in the surrounding prose that the default is a single partition and the example uses 8 to illustrate multi-partition behavior.

🟢 LOW / NIT

L1. TtlBucketPartitionStrategy example calls undefined ExtractTtlBucket
  • Issue: The example strategy calls ExtractTtlBucket(key, value) which is an undefined function. This is fine for a blog post (it's illustrative), but a reader might attempt to compile it verbatim.
  • Suggested fix: Add a comment like // application-defined helper next to the call.
L2. Blog could mention blob_compression_type interaction
  • Issue: The blog mentions "If blob compression is enabled, the strategy still receives the original uncompressed value" which is good. However, it doesn't mention blob_compression_type as the configurable option name. Minor omission.
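
For readers who want to compile the TtlBucketPartitionStrategy example mentioned in L1, one possible shape for the application-defined helper is sketched below. This is an illustration only, not code from the blog post or from RocksDB. The value layout (an 8-byte little-endian expiry timestamp at the start of the value), the one-hour bucket width, the EncodeExpiry and SelfCheck helpers, and the use of std::string_view in place of RocksDB's Slice are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Assumption: one-hour buckets. Pick a width that matches the TTL granularity.
constexpr uint64_t kBucketWidthSeconds = 3600;

// Hypothetical application-defined helper (not part of RocksDB).
// Assumes the value starts with an 8-byte little-endian expiry time in
// seconds since the epoch. The key is unused in this sketch.
uint64_t ExtractTtlBucket(std::string_view /*key*/, std::string_view value) {
  if (value.size() < sizeof(uint64_t)) {
    return 0;  // No expiry prefix: fall back to bucket 0.
  }
  uint64_t expiry_seconds = 0;
  for (std::size_t i = 0; i < sizeof(uint64_t); ++i) {
    expiry_seconds |=
        static_cast<uint64_t>(static_cast<unsigned char>(value[i])) << (8 * i);
  }
  return expiry_seconds / kBucketWidthSeconds;
}

// Test-only helper: builds the assumed 8-byte little-endian prefix.
std::string EncodeExpiry(uint64_t expiry_seconds) {
  std::string out(sizeof(uint64_t), '\0');
  for (std::size_t i = 0; i < sizeof(uint64_t); ++i) {
    out[i] = static_cast<char>((expiry_seconds >> (8 * i)) & 0xFF);
  }
  return out;
}

// Small self-check showing how expiry times map to buckets.
void SelfCheck() {
  assert(ExtractTtlBucket("k", EncodeExpiry(7199) + "payload") == 1);
  assert(ExtractTtlBucket("k", EncodeExpiry(7200) + "payload") == 2);
  assert(ExtractTtlBucket("k", "abc") == 0);  // Too short to hold a prefix.
}
```

A strategy built on a helper like this would still need to map the bucket number onto one of the configured partitions, for example by taking it modulo the partition count. That mapping is left out here because it depends on the actual BlobFilePartitionStrategy interface.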

Technical Accuracy Verification

| Claim | Verified? | Notes |
| --- | --- | --- |
| BlobWriteBatchTransformer is the core class | YES | db/blob/blob_write_batch_transformer.{h,cc} |
| BlobFilePartitionManager manages partitions | YES | db/blob/blob_file_partition_manager.{h,cc} |
| WriteBlob() method exists | YES | blob_file_partition_manager.h:76 |
| RotateCurrentGeneration() method exists | YES | blob_file_partition_manager.h:92 |
| PrepareFlushAdditions() method exists | YES | blob_file_partition_manager.h:100 |
| ResolveBlobDirectWriteIndex() method exists | YES | blob_file_partition_manager.h:151 |
| BlobFilePartitionStrategy API matches | YES | blob_file_partition_strategy.h |
| enable_blob_direct_write is immutable option | YES | advanced_options.h:1227 |
| IngestWriteBatchWithIndex blocked | YES | db_impl_write.cc:272-275 |
| allow_concurrent_memtable_write option name | YES | options.h:1429 |
| Single manager mutex in v1 | YES | blob_file_partition_manager.h:278 |
| Author xbw in authors.yml | YES | authors.yml:109-111 |

Positive Observations

  • Well-structured with clear sections covering write path, read path, lifecycle, partitioning use cases, and limitations.
  • Limitations section is thorough and honest about v1 constraints including crash recovery.
  • Code examples are syntactically correct and match the actual API.
  • No non-ASCII characters (compliant with CLAUDE.md).
  • Front matter follows the same convention as other recent blog posts.

@meta-codesync

meta-codesync Bot commented Jun 24, 2026

@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D109564817.
