feat(curvine-cli): auto sync metadata from ufs when mount #712

Merged: szbr9486 merged 2 commits into main from curvine-cli on Mar 16, 2026

@lzjqsdd (Member) commented Mar 16, 2026

Problem

Mount resync in fs_mode did not run automatically on first mount and gave little visibility into long-running syncs. Existing e2e coverage was also too shallow to exercise nested and batch metadata synchronization.

Design

Unify manual and auto resync into one reusable flow with in-place progress reporting. Ensure missing Curvine directories are created during traversal, and harden e2e with high-volume nested data plus strict post-sync assertions and cleanup.

Key Changes

  • Add reusable resync execution path used by both mount resync and first mount auto-resync in fs_mode.
  • Add dynamic progress output (running/done, elapsed, scanned/recreated/skipped/failed, pending dirs).
  • Create missing Curvine directories during resync traversal to avoid list failures on new UFS directories.
  • Expand build/tests/resync_e2e.sh with nested stress distribution (3-5 levels, >=100 dirs, >=1000 files), parallel upload, strict summary/count checks, and proactive/after-run cleanup for both UFS and Curvine test data.
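
To make the traversal and progress bullets above concrete, here is a minimal sketch, not the code in curvine-cli/src/cmds/mount.rs. It assumes an in-memory stand-in for the UFS listing and the Curvine namespace; `UfsTree`, `CvNamespace`, `resync`, and `progress_line` are hypothetical names, and the exact meaning of recreated/skipped in the real tool is assumed here (new file vs. already present).

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};
use std::time::Duration;

/// Counters mirroring the fields named in the progress output.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct ResyncStats {
    pub scanned: u64,
    pub recreated: u64,
    pub skipped: u64,
    pub failed: u64,
}

/// Hypothetical UFS listing: directory path -> (child dir names, file names).
pub type UfsTree = BTreeMap<String, (Vec<String>, Vec<String>)>;

/// Hypothetical stand-in for the Curvine namespace.
#[derive(Debug, Default)]
pub struct CvNamespace {
    pub dirs: BTreeSet<String>,
    pub files: BTreeSet<String>,
}

/// Render one status line; a caller could prefix it with '\r' to redraw in place.
pub fn progress_line(done: bool, elapsed: Duration, stats: &ResyncStats, pending_dirs: usize) -> String {
    let state = if done { "done" } else { "running" };
    format!(
        "[{}] elapsed={}s scanned={} recreated={} skipped={} failed={} pending_dirs={}",
        state,
        elapsed.as_secs(),
        stats.scanned,
        stats.recreated,
        stats.skipped,
        stats.failed,
        pending_dirs
    )
}

/// Breadth-first walk of the UFS tree starting at `root`.
pub fn resync(ufs: &UfsTree, root: &str, cv: &mut CvNamespace) -> ResyncStats {
    let mut stats = ResyncStats::default();
    let mut pending: VecDeque<String> = VecDeque::new();
    pending.push_back(root.to_string());

    while let Some(dir) = pending.pop_front() {
        // Create the directory on the Curvine side before touching its
        // children, so a directory that is new in UFS never fails to list.
        cv.dirs.insert(dir.clone());

        let (subdirs, files) = match ufs.get(&dir) {
            Some(entry) => entry,
            None => {
                // The UFS listing for this directory is unavailable.
                stats.failed += 1;
                continue;
            }
        };
        for name in files {
            stats.scanned += 1;
            // `insert` returns false when the path was already present.
            if cv.files.insert(format!("{}/{}", dir, name)) {
                stats.recreated += 1;
            } else {
                stats.skipped += 1;
            }
        }
        for name in subdirs {
            pending.push_back(format!("{}/{}", dir, name));
        }
    }
    stats
}
```

Because both the manual and the first-mount path would call the same `resync` function, a progress printer only needs to be wired in once, for example by calling `progress_line(false, start.elapsed(), &stats, pending.len())` inside the loop.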

Made with Cursor

Copilot AI review requested due to automatic review settings March 16, 2026 07:00
Copilot AI left a comment

Pull request overview

This PR adds an automatic metadata resync from UFS on first cv mount for fs_mode mounts, improves resync visibility via progress reporting, and strengthens end-to-end coverage to validate nested/high-volume resync behavior.

Changes:

  • Reuse a single resync execution path for both mount resync and first-mount auto-resync in fs_mode, with periodic progress output.
  • Ensure missing Curvine directories are created during resync traversal to avoid list failures.
  • Expand build/tests/resync_e2e.sh with nested stress uploads, stricter assertions, and more proactive cleanup.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| curvine-common/src/state/mount.rs | Adds unit coverage for fs_mode vs cache_mode guard behavior used by resync logic. |
| curvine-cli/src/cmds/mount.rs | Introduces auto-resync on first mount in fs_mode, unifies resync flow, and adds progress reporting + CV dir creation during traversal. |
| build/tests/resync_e2e.sh | Adds new scenarios (cache-mode rejection, auto-resync), plus high-volume nested stress testing and cleanup helpers. |
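
The fs_mode vs cache_mode guard mentioned above might look roughly like the following. This is only a sketch under assumptions: `WriteType`, `parse`, and `ensure_resync_allowed` are hypothetical names, the error text is invented, and `fs_mode` is assumed to be accepted wherever the script passes `cache_mode` to `--write-type`.

```rust
/// Hypothetical mirror of the mount write type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteType {
    FsMode,
    CacheMode,
}

impl WriteType {
    /// Parse the string form assumed to be used on the command line.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "fs_mode" => Some(WriteType::FsMode),
            "cache_mode" => Some(WriteType::CacheMode),
            _ => None,
        }
    }
}

/// Resync (manual or first-mount) is allowed only for fs_mode mounts.
pub fn ensure_resync_allowed(write_type: WriteType) -> Result<(), String> {
    match write_type {
        WriteType::FsMode => Ok(()),
        WriteType::CacheMode => {
            Err("resync is only supported for fs_mode mounts".to_string())
        }
    }
}
```

Keeping the check in one small function is what lets a unit test in the common crate and an e2e "cache-mode rejection" scenario assert the same rule.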

Comment on lines +154 to +155
|| err.to_string().contains("not exists")
|| err.to_string().contains("not found")
@lzjqsdd (Member, Author) replied:

Thanks for catching this. I removed the string-based fallback and now classify missing CV directories by ErrorKind only (FileNotFound/Expired), so unrelated errors are no longer silently treated as missing.
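
For readers following the thread, the difference between the two classifications can be sketched as below. This is an illustration under assumptions, not the project's error type: `FsError` and both helper functions are hypothetical, and only the FileNotFound and Expired kinds are taken from the reply above.

```rust
use std::fmt;

/// Hypothetical error kinds; only FileNotFound and Expired are named in the thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorKind {
    FileNotFound,
    Expired,
    PermissionDenied,
    Io,
}

/// Hypothetical error type carrying a kind plus a free-form message.
#[derive(Debug)]
pub struct FsError {
    pub kind: ErrorKind,
    pub message: String,
}

impl fmt::Display for FsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}: {}", self.kind, self.message)
    }
}

/// String-based fallback, shown only for contrast: any message that
/// happens to contain these phrases is treated as a missing directory.
pub fn is_missing_dir_by_text(err: &FsError) -> bool {
    let text = err.to_string();
    text.contains("not exists") || text.contains("not found")
}

/// Kind-based check: only the two kinds named in the reply count as missing.
pub fn is_missing_dir(err: &FsError) -> bool {
    matches!(err.kind, ErrorKind::FileNotFound | ErrorKind::Expired)
}
```

With these definitions, a PermissionDenied error whose message reads "credentials not found" is misclassified by the text check but correctly rejected by the kind check, which is the false positive the review comment was concerned about.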

UFS_PREFIX="${UFS_PREFIX:-curvine-test}"
CV_PATH="${CV_PATH:-/miniocluster/curvine-test}"
S3_ENDPOINT_URL="${S3_ENDPOINT_URL:-http://127.0.0.1:9009}"
S3_REGION="${S3_REGION:-cn-beigjing}"
@lzjqsdd (Member, Author) replied:

Good catch. Updated the default region typo from cn-beigjing to cn-beijing to avoid confusion in logs/configs.


 TMP_DIR="$(mktemp -d)"
-trap 'rm -rf "$TMP_DIR"' EXIT
+trap 'cleanup_base_test_data; rm -rf "$TMP_DIR"' EXIT
@lzjqsdd (Member, Author) replied:

Agreed. I extended the EXIT trap to run cleanup_previous_test_prefixes in addition to base cleanup, so dynamic auto/stress prefixes are also cleaned when the script exits early.

Comment on lines +285 to +291
run_cv mount "s3://$BUCKET/$CACHE_MODE_UFS_PREFIX" "$CACHE_MODE_CV_PATH" \
  --write-type cache_mode \
  --config s3.endpoint_url="$S3_ENDPOINT_URL" \
  --config s3.credentials.access="$S3_ACCESS_KEY" \
  --config s3.credentials.secret="$S3_SECRET_KEY" \
  --config s3.force.path.style="$S3_FORCE_PATH_STYLE" \
  >/dev/null 2>&1
@lzjqsdd (Member, Author) replied:

Fixed. Added --config s3.region_name="$S3_REGION" to this cache-mode mount invocation for consistency and AWS compatibility.

mc cp "$TMP_DIR/auto-b.txt" "$auto_b_mc_path" >/dev/null
log "scenario F: first mount auto-resync should not fail on missing cv dirs"
out_f="$(run_cv mount "s3://$BUCKET/$AUTO_UFS_PREFIX" "$AUTO_CV_PATH" \
--config s3.endpoint_url="$S3_ENDPOINT_URL" \
@lzjqsdd (Member, Author) replied:

Fixed. Added --config s3.region_name="$S3_REGION" to the auto-resync scenario mount invocation as well.

create_nested_stress_files "$STRESS_UFS_PREFIX" "$STRESS_FILE_COUNT" "$TMP_DIR/stress-seed.txt"
log "scenario G: mount and trigger auto resync for stress path"
out_g_mount="$(run_cv mount "s3://$BUCKET/$STRESS_UFS_PREFIX" "$STRESS_CV_PATH" \
--config s3.endpoint_url="$S3_ENDPOINT_URL" \
@lzjqsdd (Member, Author) replied:

Fixed. Added --config s3.region_name="$S3_REGION" to the stress-test mount invocation to avoid AWS endpoint failures.

Use ErrorKind-based missing-dir detection to avoid string-based false positives.
Also harden resync e2e by fixing region defaults, adding missing region configs,
and extending EXIT cleanup to include stale dynamic prefixes.

Made-with: Cursor
@szbr9486 merged commit b07bae1 into main on Mar 16, 2026
11 checks passed

3 participants