Skip to content

BuiltinSnappyCompressorV2: Skip memcpy when not needed (fix UB)#14886

Open
evanj wants to merge 1 commit into
facebook:mainfrom
evanj:evan.jones/snappy-ub
Open

BuiltinSnappyCompressorV2: Skip memcpy when not needed (fix UB)#14886
evanj wants to merge 1 commit into
facebook:mainfrom
evanj:evan.jones/snappy-ub

Conversation

@evanj

@evanj evanj commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Running the rust-rocksdb tests under valgrind failed with the following memcheck error. The error is that snappy first calls GetAppendBuffer() to get an output buffer. It then writes into the prefix, then passes it to Append(). The current implementation was doing an unnecessary memcpy in this case, with the same source and destination pointer. This is technically undefined behavior since the arguments to memcpy are restrict pointers.

Add a unit test that reproduces the valgrind error:

Source and destination overlap in memcpy(0x8aa5122, 0x8aa5122, 194)
  at 0x49206AC: __GI_memcpy (vg_replace_strmem.c:1155)
  by 0x550788F: rocksdb::(anonymous namespace)::BuiltinSnappyCompressorV2::CompressBlock(rocksdb::Slice, char*, unsigned long*, rocksdb::CompressionType*, rocksdb::ManagedPtr<rocksdb::Compressor::WorkingArea, rocksdb::Compressor, &rocksdb::Compressor::ReleaseWorkingArea>*)::MySink::Append(char const*, unsigned long) (util/compression.cc:510)
  by 0x574445B: snappy::Compress(snappy::Source*, snappy::Sink*) (in /usr/lib/aarch64-linux-gnu/libsnappy.so.1.1.8)
  by 0x550765F: rocksdb::(anonymous namespace)::BuiltinSnappyCompressorV2::CompressBlock(rocksdb::Slice, char*, unsigned long*, rocksdb::CompressionType*, rocksdb::ManagedPtr<rocksdb::Compressor::WorkingArea, rocksdb::Compressor, &rocksdb::Compressor::ReleaseWorkingArea>*) (util/compression.cc:529)
  by 0x405D4BF: rocksdb::DBCompressionTest_SnappyCompressBlockRoundTrip_Test::TestBody() (util/compression_test.cc:2755)

@meta-cla meta-cla Bot added the CLA Signed label Jun 25, 2026
@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown

✅ clang-tidy: No findings on changed lines

Completed in 60.5s.

Running the rust-rocksdb tests under valgrind failed with the
following memcheck error. The error is that snappy first calls
GetAppendBuffer() to get an output buffer. It then writes into the
prefix, then passes it to Append(). The current implementation was
doing an unnecessary memcpy in this case, with the same source and
destination pointer. This is technically undefined behavior since
the arguments to memcpy are restrict pointers.

Add a unit test that reproduces the valgrind error:

Source and destination overlap in memcpy(0x8aa5122, 0x8aa5122, 194)
  at 0x49206AC: __GI_memcpy (vg_replace_strmem.c:1155)
  by 0x550788F: rocksdb::(anonymous namespace)::BuiltinSnappyCompressorV2::CompressBlock(rocksdb::Slice, char*, unsigned long*, rocksdb::CompressionType*, rocksdb::ManagedPtr<rocksdb::Compressor::WorkingArea, rocksdb::Compressor, &rocksdb::Compressor::ReleaseWorkingArea>*)::MySink::Append(char const*, unsigned long) (util/compression.cc:510)
  by 0x574445B: snappy::Compress(snappy::Source*, snappy::Sink*) (in /usr/lib/aarch64-linux-gnu/libsnappy.so.1.1.8)
  by 0x550765F: rocksdb::(anonymous namespace)::BuiltinSnappyCompressorV2::CompressBlock(rocksdb::Slice, char*, unsigned long*, rocksdb::CompressionType*, rocksdb::ManagedPtr<rocksdb::Compressor::WorkingArea, rocksdb::Compressor, &rocksdb::Compressor::ReleaseWorkingArea>*) (util/compression.cc:529)
  by 0x405D4BF: rocksdb::DBCompressionTest_SnappyCompressBlockRoundTrip_Test::TestBody() (util/compression_test.cc:2755)
@evanj evanj force-pushed the evan.jones/snappy-ub branch from c3c7058 to 27a8840 Compare June 25, 2026 19:37
@github-actions

Copy link
Copy Markdown

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit 27a8840


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions

Copy link
Copy Markdown

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 27a8840


Summary

Clean, minimal fix for real undefined behavior (overlapping memcpy arguments) in Snappy compression. The pointer equality guard is the correct approach, the test reproduces the issue, and there are no behavioral or compatibility changes. The code is correct.

No high-severity findings.

Full review (click to expand)

Findings

🔴 HIGH

None.

🟡 MEDIUM

None.

🟢 LOW / NIT

L1. Comment style nit -- lowercase start — util/compression.cc:511
  • Issue: The comment // if data is our buffer from GetAppendBuffer, skip memcpy (also starts with a lowercase letter. Surrounding comments in this file use capitalized starts (// Virtual abort, // Compression kept/successful).
  • Suggested fix: Capitalize: // If data is our buffer from GetAppendBuffer, skip memcpy (also
L2. Test relies on Snappy internal behavior — util/compression_test.cc:2745
  • Issue: The test depends on Snappy choosing to use GetAppendBuffer for highly compressible input. This is pragmatic and sufficient given that MySink is a private struct.
  • Suggested fix: None required.
L3. memmove alternative considered — util/compression.cc:513
  • Issue: memmove would also fix UB, but the pointer equality check is strictly better -- avoids the call entirely and is more semantically clear.
  • Suggested fix: None -- current approach is superior.

Cross-Component Analysis

Context Assumptions hold? Action needed?
WritePreparedTxnDB YES None
ReadOnly DB N/A (no writes) None
CompactionService YES None
Concurrent writers YES (MySink is stack-local) None
All other contexts YES None

Assumption stress-test: Verified that data == output_ + pos_ can only occur via GetAppendBuffer, partial overlaps cannot occur per the snappy Sink contract, and pos_ += n correctly executes unconditionally outside the guard.

Positive Observations

  • Minimal, surgical fix -- exactly one condition added.
  • Pointer equality check is the optimal approach (avoids both UB and unnecessary work).
  • Well-constructed test with full compress-decompress round trip.
  • Excellent PR description with exact valgrind error and root cause.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant