fix(artifacts): harden artifact ID validation and absolute_path#2860
Conversation
- Reject "." as a valid artifact ID in validate_artifact_id (store.rs); previously this resolved to the artifacts root directory itself - Verify that meta.path stays within the artifacts root before returning absolute_path in ai_get_artifact (ops.rs), guarding against corrupt or adversarial path values stored in meta.json - Fix get_artifact schema output to reflect the actual flat JSON response shape (id, kind, title, path, size_bytes, status, created_at, absolute_path) rather than a misleading "artifact" wrapper ref - Add validate_artifact_id_rejects_dot test in store_tests.rs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe PR adds path-safety validation to the artifact operations and restructures the artifact schema output. It rejects the special "." directory reference, enforces that artifact paths remain within the artifacts root, and flattens the get_artifact response from a wrapped field to individual metadata fields. ChangesArtifact operation safety and schema restructuring
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/artifacts/ops.rs`:
- Around line 64-73: The guard using Path::starts_with is insufficient because
it doesn't normalize `..`; update the check in the code around `artifacts_root`,
`resolved`, `meta.path`, and `absolute_path` to 1) reject absolute `meta.path`
inputs, and 2) reject any `..` components by iterating `meta.path.components()`
and returning an error if any Component::ParentDir is present (or if any
Component::Prefix/RootDir when Windows), then construct the final path as
`artifacts_root.join(&meta.path)` and proceed to create `absolute_path` (you may
still optionally canonicalize the final path if desired and it exists); this
prevents path traversal without relying on starts_with.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 62b6cbc5-fa97-4c40-bafc-3f71f450ddfb
📒 Files selected for processing (4)
src/openhuman/artifacts/ops.rssrc/openhuman/artifacts/schemas.rssrc/openhuman/artifacts/store.rssrc/openhuman/artifacts/store_tests.rs
| // Guard against a corrupt or adversarial meta.path that escapes the artifacts root. | ||
| let artifacts_root = config.workspace_dir.join("artifacts"); | ||
| let resolved = artifacts_root.join(&meta.path); | ||
| if !resolved.starts_with(&artifacts_root) { | ||
| return Err(format!( | ||
| "[artifacts] meta.path {:?} escapes artifacts root for id={artifact_id}", | ||
| meta.path | ||
| )); | ||
| } | ||
| let absolute_path = resolved.to_string_lossy().into_owned(); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Rust std::path, does Path::starts_with normalize .. path segments, or is it a lexical component-prefix check?
💡 Result:
In Rust, Path::starts_with performs a component-based, lexical check rather than a semantic or filesystem-aware normalization [1][2]. It does not normalize.. (parent directory) segments, nor does it resolve symbolic links [2][3]. Because it operates on path components, it strictly requires that the components of the base path match the leading components of the path being checked [1][4]. For example,.. segments are treated as literal components; therefore, a path like a/b/../c will not be treated as a/c, and thus would not match a prefix of a/c [2][5]. If you require normalization that resolves.. segments (lexically), you would need to use other utilities, such as experimental or third-party methods, or use Path::canonicalize if you intend to resolve paths against the actual filesystem [2][3][5].
Citations:
- 1: https://doc.rust-lang.org/stable/std/path/struct.Path.html
- 2: https://github.com/rust-lang/rust/blob/2972b5e5/library/std/src/path.rs
- 3: https://doc.rust-lang.org/std/path/
- 4: https://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/share/doc/rust/html/std/path/struct.Path.html
- 5: https://docs.rs/which/latest/which/struct.Path.html
Fix path traversal guard in src/openhuman/artifacts/ops.rs
Path::starts_with is a component-based lexical prefix check and does not normalize/resolve .., so artifacts_root.join("../x") can still pass the starts_with test and produce an out-of-root absolute_path.
🔧 Suggested fix
- let resolved = artifacts_root.join(&meta.path);
- if !resolved.starts_with(&artifacts_root) {
+ let rel_path = std::path::Path::new(&meta.path);
+ if rel_path.is_absolute()
+ || rel_path
+ .components()
+ .any(|c| matches!(c, std::path::Component::ParentDir))
+ {
return Err(format!(
"[artifacts] meta.path {:?} escapes artifacts root for id={artifact_id}",
meta.path
));
}
- let absolute_path = resolved.to_string_lossy().into_owned();
+ let absolute_path = artifacts_root.join(rel_path).to_string_lossy().into_owned();📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Guard against a corrupt or adversarial meta.path that escapes the artifacts root. | |
| let artifacts_root = config.workspace_dir.join("artifacts"); | |
| let resolved = artifacts_root.join(&meta.path); | |
| if !resolved.starts_with(&artifacts_root) { | |
| return Err(format!( | |
| "[artifacts] meta.path {:?} escapes artifacts root for id={artifact_id}", | |
| meta.path | |
| )); | |
| } | |
| let absolute_path = resolved.to_string_lossy().into_owned(); | |
| // Guard against a corrupt or adversarial meta.path that escapes the artifacts root. | |
| let artifacts_root = config.workspace_dir.join("artifacts"); | |
| let rel_path = std::path::Path::new(&meta.path); | |
| if rel_path.is_absolute() | |
| || rel_path | |
| .components() | |
| .any(|c| matches!(c, std::path::Component::ParentDir)) | |
| { | |
| return Err(format!( | |
| "[artifacts] meta.path {:?} escapes artifacts root for id={artifact_id}", | |
| meta.path | |
| )); | |
| } | |
| let absolute_path = artifacts_root.join(rel_path).to_string_lossy().into_owned(); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/openhuman/artifacts/ops.rs` around lines 64 - 73, The guard using
Path::starts_with is insufficient because it doesn't normalize `..`; update the
check in the code around `artifacts_root`, `resolved`, `meta.path`, and
`absolute_path` to 1) reject absolute `meta.path` inputs, and 2) reject any `..`
components by iterating `meta.path.components()` and returning an error if any
Component::ParentDir is present (or if any Component::Prefix/RootDir when
Windows), then construct the final path as `artifacts_root.join(&meta.path)` and
proceed to create `absolute_path` (you may still optionally canonicalize the
final path if desired and it exists); this prevents path traversal without
relying on starts_with.
Summary
Security hardening for the artifacts persistence layer introduced in #2801:
.as artifact ID —validate_artifact_idinstore.rsnow explicitly rejects".", which previously resolved to the artifacts root directory itself, allowing operations to target the root instead of a specific artifact subdirectoryabsolute_pathinai_get_artifact—ops.rsnow verifies thatartifacts_root.join(&meta.path)still starts withartifacts_rootbefore returning the path, guarding against corrupt or adversarialmeta.pathvalues in storedmeta.jsonfiles that could expose paths outside the artifacts sandboxget_artifactschema output shape — The schema inschemas.rspreviously declared a single"artifact": ArtifactMetawrapper output, but the actual JSON response returns allArtifactMetafields flat at the top level plus anabsolute_pathfield. Schema now matches the real response shape.validate_artifact_id_rejects_dottest instore_tests.rsTest plan
GGML_NATIVE=OFF cargo check --manifest-path Cargo.tomlpasses (only pre-existing warnings)cargo fmt --manifest-path Cargo.toml --checkpassesvalidate_artifact_id_rejects_dotcovers the.rejectionschemas_get_artifact_requires_artifact_idnow asserts flat output fields includingabsolute_pathand absence of the"artifact"wrapperNotes
This PR was not included in the original #2801 because it was merged before these fixes could be applied. These are follow-up hardening changes with no functional regression risk.
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Improvements