Add ColumnDB projected blob read prototype #14847

Draft
xingbowang wants to merge 1 commit into facebook:main from xingbowang:export-D108213380

@xingbowang (Contributor)

Summary:
Add a `ColumnDB` stackable DB prototype in `fbcode/internal_repo_rocksdb/repo` for projected reads from blob-backed wide-column entities. `ColumnDB` reads the inline schema column, invokes a caller-provided translate callback to map requested columns to blob byte ranges, coalesces adjacent ranges, and calls `DBImpl::MultiGetBlobRanges` to issue partial blob-file reads.
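
The range-coalescing step described above could look roughly like the sketch below. It is not taken from the PR: `ByteRange` and `CoalesceRanges` are hypothetical names chosen for illustration, and the sketch assumes that "adjacent" means ranges that touch or overlap after sorting by offset. The actual `ColumnDB` logic may use a different representation or a gap threshold.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration only; not the PR's actual definition.
struct ByteRange {
  uint64_t offset;  // start of the range within the blob value
  uint64_t length;  // number of bytes to read
};

// Sorts the ranges by offset, then merges every range that touches or
// overlaps the previous one, so each merged range maps to a single read.
std::vector<ByteRange> CoalesceRanges(std::vector<ByteRange> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange& a, const ByteRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ByteRange> merged;
  for (const ByteRange& r : ranges) {
    if (r.length == 0) {
      continue;  // empty ranges contribute nothing to the read
    }
    if (!merged.empty() &&
        r.offset <= merged.back().offset + merged.back().length) {
      // Extend the previous range to cover this one.
      const uint64_t end = std::max(merged.back().offset + merged.back().length,
                                    r.offset + r.length);
      merged.back().length = end - merged.back().offset;
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}
```

With this sketch, the ranges `{0, 10}` and `{10, 20}` collapse into one read of 30 bytes, while a range starting at offset 100 stays separate.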

This also adds lazy V2 wide-column blob-index exposure for callers that explicitly request raw blob indexes, partial blob-range read plumbing through `DBImpl` and `Version`, external blob-file support in `SstFileWriter` and external-file ingestion, and `db_bench` workloads plus `tools/run_column_db_bench.sh` for full-read versus projected-read comparison.

Benchmark result from the prior direct-IO host run:

| value_size | projected | stride | full_read_ops | column_read_ops | read_speedup | full_read_cpu_s | column_read_cpu_s | read_cpu_s_ratio | full_read_fs_inputs | column_read_fs_inputs | read_fs_input_ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 102400 | 5 | 1 | 32401 | 44437 | 1.371 | 40.660 | 25.480 | 0.627 | 172803016 | 26247264 | 0.152 |
| 102400 | 5 | 97 | 32667 | 31848 | 0.975 | 40.670 | 39.480 | 0.971 | 172803016 | 51852160 | 0.300 |
| 102400 | 10 | 1 | 32659 | 44076 | 1.350 | 40.490 | 25.960 | 0.641 | 172803016 | 27046424 | 0.157 |
| 102400 | 10 | 97 | 32297 | 27431 | 0.849 | 42.310 | 60.310 | 1.425 | 172803016 | 84649256 | 0.490 |
| 1048576 | 5 | 1 | 7802 | 41900 | 5.370 | 225.720 | 28.860 | 0.128 | 1651236952 | 33620536 | 0.020 |
| 1048576 | 5 | 97 | 7793 | 37065 | 4.756 | 226.570 | 39.910 | 0.176 | 1651236952 | 59209224 | 0.036 |
| 1048576 | 10 | 1 | 7795 | 41071 | 5.269 | 224.660 | 28.500 | 0.127 | 1651236952 | 41813008 | 0.025 |
| 1048576 | 10 | 97 | 7806 | 31307 | 4.011 | 223.940 | 62.210 | 0.278 | 1651236952 | 99409896 | 0.060 |
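
The three ratio columns appear to be derived from the raw columns as column-read value divided by full-read value; the PR does not state this, but the arithmetic matches the reported rows. The sketch below spells out that assumed derivation. `BenchRow`, `BenchRatios`, and `ComputeRatios` are hypothetical names, not part of `db_bench` or `tools/run_column_db_bench.sh`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical holder for the raw columns of one benchmark row.
struct BenchRow {
  uint64_t full_read_ops;
  uint64_t column_read_ops;
  double full_read_cpu_s;
  double column_read_cpu_s;
  uint64_t full_read_fs_inputs;
  uint64_t column_read_fs_inputs;
};

// Hypothetical holder for the derived columns.
struct BenchRatios {
  double read_speedup;         // > 1 means projected reads did more ops
  double read_cpu_s_ratio;     // < 1 means projected reads used less CPU
  double read_fs_input_ratio;  // < 1 means projected reads did less file input
};

// Assumed derivation: each ratio is column-read divided by full-read.
BenchRatios ComputeRatios(const BenchRow& row) {
  BenchRatios r;
  r.read_speedup = static_cast<double>(row.column_read_ops) /
                   static_cast<double>(row.full_read_ops);
  r.read_cpu_s_ratio = row.column_read_cpu_s / row.full_read_cpu_s;
  r.read_fs_input_ratio = static_cast<double>(row.column_read_fs_inputs) /
                          static_cast<double>(row.full_read_fs_inputs);
  return r;
}
```

Applied to the first row (value_size 102400, projected 5, stride 1), this yields values that round to the reported 1.371, 0.627, and 0.152.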

Differential Revision: D108213380

meta-cla Bot added the CLA Signed label Jun 11, 2026

meta-codesync Bot commented Jun 11, 2026

@xingbowang has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108213380.

@github-actions

⚠️ clang-tidy: 1 warning(s) on changed lines

Completed in 425.4s.

Summary by check

| Check | Count |
| --- | --- |
| performance-no-automatic-move | 1 |
| Total | 1 |

Details

table/sst_file_writer.cc (1 warning(s))
table/sst_file_writer.cc:277:14: warning: constness of 's' prevents automatic move [performance-no-automatic-move]
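
For context, `performance-no-automatic-move` flags a `const` local that is returned through a converting constructor: the constness forces the copy overload where a non-const local would be moved implicitly. The sketch below is a generic illustration of that pattern, not the code at `table/sst_file_writer.cc:277`; `Payload`, `Result`, and the three functions are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical payload type standing in for any movable value.
struct Payload {
  std::string data;
};

// Hypothetical wrapper that records which constructor was chosen.
// The constructors are intentionally implicit so `return local;` compiles.
struct Result {
  Payload value;
  bool copied;
  Result(const Payload& p) : value(p), copied(true) {}
  Result(Payload&& p) : value(std::move(p)), copied(false) {}
};

Payload MakePayload() { return Payload{"blob"}; }

// The pattern the check warns about: `s` is const, so the return
// statement cannot bind to Result(Payload&&) and falls back to a copy.
Result FromConstLocal() {
  const Payload s = MakePayload();
  return s;
}

// Dropping `const` lets the implicit move on return pick Result(Payload&&).
Result FromMutableLocal() {
  Payload s = MakePayload();
  return s;
}
```

The usual fix is to drop the `const` qualifier from the local so the return statement can move it.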

@xingbowang xingbowang marked this pull request as draft June 11, 2026 14:20