feat(artifacts): define artifact storage types, config, and persistence layer by graycyrus · Pull Request #2801 · tinyhumansai/openhuman

graycyrus · 2026-05-28T02:25:05Z

Summary

Adds new artifacts domain (src/openhuman/artifacts/) with ArtifactMeta struct, ArtifactKind enum (Presentation, Document, Image, Other), and ArtifactStatus enum (Pending, Ready, Failed)
Implements filesystem-backed persistence under <workspace>/artifacts/<uuid>/meta.json + binary blobs with lazy directory creation
Registers three RPC endpoints: ai_list_artifacts (paginated), ai_get_artifact (metadata + absolute path), ai_delete_artifact (removes metadata + blob)
Includes 38 unit tests covering CRUD operations, pagination, path-sandboxing enforcement (traversal, absolute paths, slashes), and corrupt metadata handling

Test plan

cargo check passes
cargo test -p openhuman --lib -- "artifacts" — 38/38 tests pass
cargo fmt --check clean
CI coverage gate (≥80% diff coverage) — 38 tests target all public functions and branches

Checklist

N/A: no frontend changes — Rust-only PR
N/A: no i18n changes
N/A: no config schema changes
Debug logging with [artifacts] prefix on all operations
Path sandboxing: two-layer validation (ID validation + canonical path containment)
Follows controller schema pattern (cron domain reference)

Closes #2776

Summary by CodeRabbit

New Features

Introduced artifact management system with operations to list, retrieve, and delete artifacts
Artifacts now maintain metadata including type, status, title, creation date, and storage information
Added input validation for artifact identifiers and secure path handling

…ce layer Add new `artifacts` domain with ArtifactMeta struct, ArtifactKind/ArtifactStatus enums, filesystem-backed persistence under <workspace>/artifacts/<uuid>/, and three RPC endpoints (ai_list_artifacts, ai_get_artifact, ai_delete_artifact). Includes 38 unit tests covering CRUD operations, pagination, path-sandboxing enforcement, and corrupt metadata handling. Closes tinyhumansai#2776

…-types

coderabbitai · 2026-05-28T02:25:13Z

📝 Walkthrough

Walkthrough

This PR implements the artifact storage domain for OpenHuman, introducing typed artifact metadata (ArtifactMeta), filesystem-backed persistence with path validation, and three RPC endpoints (ai_list_artifacts, ai_get_artifact, ai_delete_artifact) that enable workspace artifact management with pagination, sorting, corruption resilience, and comprehensive test coverage.

Changes

Artifact Storage Domain

Layer / File(s)	Summary
Artifact data types and contracts `src/openhuman/artifacts/types.rs`	`ArtifactKind` (Presentation/Document/Image/Other) and `ArtifactStatus` (Pending/Ready/Failed) enums with case-insensitive parsing and string conversion. `ArtifactMeta` struct persists artifact id, kind, title, relative path, size, status, and UTC creation timestamp with serde roundtrip and JSON field validation tests.
Artifact storage, validation, and persistence `src/openhuman/artifacts/store.rs`, `src/openhuman/artifacts/store_tests.rs`	Filesystem storage under `<workspace_dir>/artifacts/` with lazy directory creation. Two-layer path validation: `validate_artifact_id` rejects empty strings, absolute paths, and traversal patterns; `assert_within_root` enforces containment. Async operations: `save_artifact_meta` writes pretty-printed JSON, `list_artifacts` iterates and sorts by created_at descending with pagination, `get_artifact` retrieves metadata, `delete_artifact` removes directories. Store tests verify roundtrips, pagination, sorting, deletion, corruption skipping, and path traversal rejection.
RPC artifact operations handlers `src/openhuman/artifacts/ops.rs`, `src/openhuman/artifacts/ops_tests.rs`	Three public RPC handlers: `ai_list_artifacts` normalizes offset/limit bounds, logs request count, delegates to store; `ai_get_artifact` validates id, retrieves metadata, computes absolute_path, injects into JSON; `ai_delete_artifact` validates and deletes via store. Ops tests validate empty listings, default pagination, and empty id rejection.
RPC controller schemas and handlers `src/openhuman/artifacts/schemas.rs`	Exports `all_controller_schemas()` and `all_registered_controllers()` registry functions. `schemas(function)` defines ControllerSchema structures for list/get/delete with input/output specs and fallback "unknown" schema. Async handlers load config with timeout, parse parameters using `read_required`/`read_optional_u64` with bounds checking, call ops, convert results to JSON via `to_json` and `type_name`. Comprehensive tests cover schema structure, handler registration, parameter validation (including negative rejection), and JSON variant mapping.
Module structure and public API `src/openhuman/artifacts/mod.rs`, `src/openhuman/mod.rs`	Artifacts module hub declares submodules (ops, schemas, store, types) and re-exports schema aggregations as `all_artifacts_*` and core types. Root openhuman module declares artifacts as public submodule.
Core RPC registry integration `src/core/all.rs`	`build_registered_controllers()` extends registry with `all_artifacts_registered_controllers()`. `build_declared_controller_schemas()` extends catalog with `all_artifacts_controller_schemas()`. `namespace_description()` adds `ai` namespace mapping.
Design documentation `.claude/memory.md`	Records artifacts domain: persistence layout (JSON meta.json + binary blobs), RPC namespace/method naming (openhuman.ai_*), two-step filesystem path validation approach, and cargo test invocation for lib-crate tests.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ops
  participant store
  participant filesystem
  Client->>ops: ai_list_artifacts(offset, limit)
  ops->>store: list_artifacts(workspace_dir, offset, limit)
  store->>filesystem: read artifacts/ directory entries
  store->>filesystem: read each meta.json
  store->>store: sort by created_at descending
  store-->>ops: (Vec<ArtifactMeta>, total)
  ops-->>Client: JSON with artifacts, total, offset, limit
  Client->>ops: ai_get_artifact(artifact_id)
  ops->>store: get_artifact(workspace_dir, artifact_id)
  store->>store: validate_artifact_id
  store->>store: assert_within_root
  store->>filesystem: read artifact/meta.json
  store-->>ops: ArtifactMeta
  ops->>ops: compute absolute_path
  ops-->>Client: JSON with meta + absolute_path
  Client->>ops: ai_delete_artifact(artifact_id)
  ops->>store: delete_artifact(workspace_dir, artifact_id)
  store->>store: validate and assert containment
  store->>filesystem: remove artifact directory recursively
  store-->>ops: success
  ops-->>Client: JSON with artifact_id, deleted: true

sequenceDiagram
  participant Client
  participant schemas
  participant ops
  Client->>schemas: RPC request with function and params JSON
  schemas->>schemas: load config with timeout
  schemas->>schemas: parse/validate parameters
  schemas->>ops: call ai_* operation
  ops-->>schemas: RpcOutcome result
  schemas->>schemas: convert to JSON
  schemas-->>Client: JSON response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

tinyhumansai/openhuman#2025: Modifies src/core/all.rs to extend the same controller registry and schema catalog builders (adding desktop_companion domain alongside artifacts), plus updates namespace description mappings.

Suggested labels

feature, rust-core, working

Suggested reviewers

senamakel

Poem

🐰 A hop through types, then storage deep,
Where metadata dreams the filesystem keep,
Three RPC paths unfold with care—list, get, and delete—
Now artifacts have a cozy, validated seat! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and specifically describes the main change: defining artifact storage types, config, and persistence layer, which matches the core objectives.
Linked Issues check	✅ Passed	The PR fully addresses all coding requirements from issue `#2776`: ArtifactMeta/ArtifactKind/ArtifactStatus types, RPC endpoints (ai_list_artifacts, ai_get_artifact, ai_delete_artifact), filesystem persistence, path sandboxing, and comprehensive unit tests covering CRUD and validation.
Out of Scope Changes check	✅ Passed	All changes are scoped to the artifacts domain: types, persistence layer, RPC operations, and module registration. No unrelated changes to other systems or explicitly out-of-scope features (creation logic, UI, PPT libraries) are present.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.claude/memory.md:
- Line 217: Update the note to correct the behavior and point to the canonical
policy: clarify that assert_within_root does not canonicalize but only checks
prefix containment (starts_with), remove the instruction to “replicate this
pattern” for new domains, and instead direct teams to use the centralized file
I/O validation in src/openhuman/security/policy.rs (i.e., call validate_path /
validate_parent_path) while keeping validate_artifact_id's two-layer checks as
an example for artifact IDs only.

In `@src/openhuman/artifacts/ops.rs`:
- Around line 63-69: The current derivation of absolute_path uses meta.path
directly which allows absolute paths or parent-traversal to escape the artifacts
root; update the logic that builds absolute_path (the code touching
absolute_path, config.workspace_dir, and meta.path) to first validate/sanitize
meta.path: reject or canonicalize if Path::new(&meta.path).is_absolute() and
reject or disallow any Path::new(&meta.path).components() containing
Component::ParentDir, or alternatively rebuild a safe relative PathBuf by
filtering components to only Normal components before joining; if validation
fails return an error instead of joining, then join the sanitized relative path
to config.workspace_dir.join("artifacts") and convert to string as before.

In `@src/openhuman/artifacts/schemas.rs`:
- Around line 87-92: The output schema for get_artifact currently uses
TypeSchema::Ref("ArtifactMeta") but the comment claims an extra absolute_path
field; update the contract to match by replacing the output FieldSchema type
with an explicit schema that includes ArtifactMeta plus absolute_path (either by
changing the type to Ref("ArtifactMetaWithPath") and adding that new type
definition, or by using an inline TypeSchema::Object that spreads ArtifactMeta
fields and adds absolute_path: TypeSchema::String required: true); ensure the
FieldSchema name stays "artifact" and adjust any schema registry entries that
reference ArtifactMeta if you introduce ArtifactMetaWithPath.

In `@src/openhuman/artifacts/store.rs`:
- Around line 26-57: The validator validate_artifact_id currently allows the
single-dot "." which leads to delete_artifact calling remove_dir_all on the
artifacts root; update validate_artifact_id to reject "." (and also reject "./"
and ".\\" to be safe) by returning an Err with a clear message (same format as
other errors), so delete_artifact and any other callers cannot use "." as an
artifact_id and accidentally operate on the whole artifacts tree.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 37ca5f18-69fc-4ea8-ad37-543ac0921a5a

📥 Commits

Reviewing files that changed from the base of the PR and between 3f2e2f2 and f2dd2b0.

📒 Files selected for processing (10)

.claude/memory.md
src/core/all.rs
src/openhuman/artifacts/mod.rs
src/openhuman/artifacts/ops.rs
src/openhuman/artifacts/ops_tests.rs
src/openhuman/artifacts/schemas.rs
src/openhuman/artifacts/store.rs
src/openhuman/artifacts/store_tests.rs
src/openhuman/artifacts/types.rs
src/openhuman/mod.rs

coderabbitai · 2026-05-28T05:27:54Z

+
+- **Filesystem-backed persistence, no SQLite** — `src/openhuman/artifacts/` stores JSON metadata (`meta.json`) + binary blobs under `<workspace_dir>/artifacts/<uuid>/`. Pattern mirrors `memory/ops/files.rs` but simpler.
+- **`"ai"` namespace in controller registry** — RPC methods are `openhuman.ai_list_artifacts`, `openhuman.ai_get_artifact`, `openhuman.ai_delete_artifact`. Future `ai_*` methods should use this same namespace.
+- **Two-layer path validation required** — (1) `validate_artifact_id` rejects empty strings, `/`, `\`, `..`, absolute Unix paths, Windows `C:` and UNC `\\` paths; (2) `assert_within_root` canonicalizes and checks containment. Replicate this pattern for any new filesystem-backed domain.


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Correct the security guidance and assert_within_root description.

This line is currently misleading in two ways: assert_within_root here does not canonicalize paths (it checks starts_with), and advising teams to “replicate this pattern” conflicts with the documented policy that file I/O validation must go through src/openhuman/security/policy.rs (validate_path / validate_parent_path). Please update this note to avoid steering future domains toward a non-policy pattern.

Suggested doc fix

-- **Two-layer path validation required** — (1) `validate_artifact_id` rejects empty strings, `/`, `\`, `..`, absolute Unix paths, Windows `C:` and UNC `\\` paths; (2) `assert_within_root` canonicalizes and checks containment. Replicate this pattern for any new filesystem-backed domain. +- **Artifacts path safety (domain-specific)** — this domain currently uses (1) `validate_artifact_id` input checks and (2) `assert_within_root` containment checks before filesystem operations. Note: `assert_within_root` is a containment check (not canonicalization). For new filesystem-backed domains, follow the security policy entry points in `src/openhuman/security/policy.rs` (`validate_path` / `validate_parent_path`).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- **Two-layer path validation required** — (1) `validate_artifact_id` rejects empty strings, `/`, `\`, `..`, absolute Unix paths, Windows `C:` and UNC `\\` paths; (2) `assert_within_root` canonicalizes and checks containment. Replicate this pattern for any new filesystem-backed domain.

- **Artifacts path safety (domain-specific)** — this domain currently uses (1) `validate_artifact_id` input checks and (2) `assert_within_root` containment checks before filesystem operations. Note: `assert_within_root` is a containment check (not canonicalization). For new filesystem-backed domains, follow the security policy entry points in `src/openhuman/security/policy.rs` (`validate_path` / `validate_parent_path`).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.claude/memory.md at line 217, Update the note to correct the behavior and point to the canonical policy: clarify that assert_within_root does not canonicalize but only checks prefix containment (starts_with), remove the instruction to “replicate this pattern” for new domains, and instead direct teams to use the centralized file I/O validation in src/openhuman/security/policy.rs (i.e., call validate_path / validate_parent_path) while keeping validate_artifact_id's two-layer checks as an example for artifact IDs only.

coderabbitai · 2026-05-28T05:27:54Z

+    // Compute absolute path for the caller's convenience.
+    let absolute_path = config
+        .workspace_dir
+        .join("artifacts")
+        .join(&meta.path)
+        .to_string_lossy()
+        .into_owned();


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden absolute_path derivation against unsafe meta.path.

meta.path is used directly in join(...). If it is absolute (or contains parent traversal), the computed absolute_path can escape the artifacts root.

Suggested fix

+ use std::path::{Component, Path}; + + let rel = Path::new(&meta.path); + if rel.is_absolute() + || rel + .components() + .any(|c| matches!(c, Component::ParentDir | Component::RootDir | Component::Prefix(_))) + { + return Err(format!( + "[artifacts] invalid artifact path for id={artifact_id}: {:?}", + meta.path + )); + } + + let artifacts_root = config.workspace_dir.join("artifacts"); let absolute_path = config - .workspace_dir - .join("artifacts") + .workspace_dir + .join("artifacts") .join(&meta.path) .to_string_lossy() .into_owned();

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Compute absolute path for the caller's convenience.

let absolute_path = config

.workspace_dir

.join("artifacts")

.join(&meta.path)

.to_string_lossy()

.into_owned();

// Compute absolute path for the caller's convenience.

use std::path::{Component, Path};

let rel = Path::new(&meta.path);

if rel.is_absolute()

|| rel

.components()

.any(|c| matches!(c, Component::ParentDir | Component::RootDir | Component::Prefix(_)))

{

return Err(format!(

"[artifacts] invalid artifact path for id={artifact_id}: {:?}",

meta.path

));

}

let artifacts_root = config.workspace_dir.join("artifacts");

let absolute_path = config

.workspace_dir

.join("artifacts")

.join(&meta.path)

.to_string_lossy()

.into_owned();

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/artifacts/ops.rs` around lines 63 - 69, The current derivation of absolute_path uses meta.path directly which allows absolute paths or parent-traversal to escape the artifacts root; update the logic that builds absolute_path (the code touching absolute_path, config.workspace_dir, and meta.path) to first validate/sanitize meta.path: reject or canonicalize if Path::new(&meta.path).is_absolute() and reject or disallow any Path::new(&meta.path).components() containing Component::ParentDir, or alternatively rebuild a safe relative PathBuf by filtering components to only Normal components before joining; if validation fails return an error instead of joining, then join the sanitized relative path to config.workspace_dir.join("artifacts") and convert to string as before.

coderabbitai · 2026-05-28T05:27:54Z

+            outputs: vec![FieldSchema {
+                name: "artifact",
+                ty: TypeSchema::Ref("ArtifactMeta"),
+                comment: "Artifact metadata plus absolute_path field.",
+                required: true,
+            }],


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

get_artifact output schema does not encode the documented absolute_path field.

The schema advertises artifact as ArtifactMeta, but the comment says it includes an extra absolute_path. That makes discovery/docs contract ambiguous for clients. Please model the extra field explicitly (or remove that claim if not returned).

Suggested schema shape

- outputs: vec![FieldSchema { - name: "artifact", - ty: TypeSchema::Ref("ArtifactMeta"), - comment: "Artifact metadata plus absolute_path field.", - required: true, - }], + outputs: vec![FieldSchema { + name: "artifact", + ty: TypeSchema::Object { + fields: vec![ + FieldSchema { + name: "meta", + ty: TypeSchema::Ref("ArtifactMeta"), + comment: "Artifact metadata.", + required: true, + }, + FieldSchema { + name: "absolute_path", + ty: TypeSchema::String, + comment: "Resolved absolute file path for the artifact.", + required: true, + }, + ], + }, + comment: "Artifact metadata plus resolved absolute path.", + required: true, + }],

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

outputs: vec![FieldSchema {

name: "artifact",

ty: TypeSchema::Ref("ArtifactMeta"),

comment: "Artifact metadata plus absolute_path field.",

required: true,

}],

outputs: vec![FieldSchema {

name: "artifact",

ty: TypeSchema::Object {

fields: vec![

FieldSchema {

name: "meta",

ty: TypeSchema::Ref("ArtifactMeta"),

comment: "Artifact metadata.",

required: true,

},

FieldSchema {

name: "absolute_path",

ty: TypeSchema::String,

comment: "Resolved absolute file path for the artifact.",

required: true,

},

],

},

comment: "Artifact metadata plus resolved absolute path.",

required: true,

}],

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/artifacts/schemas.rs` around lines 87 - 92, The output schema for get_artifact currently uses TypeSchema::Ref("ArtifactMeta") but the comment claims an extra absolute_path field; update the contract to match by replacing the output FieldSchema type with an explicit schema that includes ArtifactMeta plus absolute_path (either by changing the type to Ref("ArtifactMetaWithPath") and adding that new type definition, or by using an inline TypeSchema::Object that spreads ArtifactMeta fields and adds absolute_path: TypeSchema::String required: true); ensure the FieldSchema name stays "artifact" and adjust any schema registry entries that reference ArtifactMeta if you introduce ArtifactMetaWithPath.

coderabbitai · 2026-05-28T05:27:55Z

+fn validate_artifact_id(id: &str) -> Result<(), String> {
+    if id.is_empty() {
+        return Err("[artifacts] artifact_id must not be empty".to_string());
+    }
+    if id.contains('/') {
+        return Err(format!(
+            "[artifacts] artifact_id must not contain '/': {id:?}"
+        ));
+    }
+    if id.contains('\\') {
+        return Err(format!(
+            "[artifacts] artifact_id must not contain '\\': {id:?}"
+        ));
+    }
+    if id == ".." || id.starts_with("../") || id.starts_with("..\\") {
+        return Err(format!(
+            "[artifacts] artifact_id must not be a path traversal: {id:?}"
+        ));
+    }
+    // Reject absolute paths (Unix /foo or Windows C:\foo / \\server\share)
+    if id.starts_with('/') || id.starts_with('\\') {
+        return Err(format!(
+            "[artifacts] artifact_id must not be an absolute path: {id:?}"
+        ));
+    }
+    // Reject Windows drive-letter paths like C:
+    if id.len() >= 2 && id.as_bytes()[1] == b':' {
+        return Err(format!(
+            "[artifacts] artifact_id must not be an absolute path: {id:?}"
+        ));
+    }
+    Ok(())


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Reject . as an artifact ID to prevent full-root deletion.

validate_artifact_id allows ".". That value flows into delete_artifact, where remove_dir_all(<workspace>/artifacts/.) can delete the entire artifacts tree.

Suggested fix

fn validate_artifact_id(id: &str) -> Result<(), String> { if id.is_empty() { return Err("[artifacts] artifact_id must not be empty".to_string()); } + if id == "." || id == ".." { + return Err(format!( + "[artifacts] artifact_id must not be a reserved path component: {id:?}" + )); + } if id.contains('/') { return Err(format!( "[artifacts] artifact_id must not contain '/': {id:?}" )); } @@ - if id == ".." || id.starts_with("../") || id.starts_with("..\\") { + if id.starts_with("../") || id.starts_with("..\\") { return Err(format!( "[artifacts] artifact_id must not be a path traversal: {id:?}" )); }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/artifacts/store.rs` around lines 26 - 57, The validator validate_artifact_id currently allows the single-dot "." which leads to delete_artifact calling remove_dir_all on the artifacts root; update validate_artifact_id to reject "." (and also reject "./" and ".\\" to be safe) by returning an Err with a clear message (same format as other errors), so delete_artifact and any other callers cannot use "." as an artifact_id and accidentally operate on the whole artifacts tree.

oxoxDev

Author: @graycyrus (MEMBER — core team)

Substantial new subsystem — artifact persistence layer with proper structure, tests, and schema definitions. Core architecture is sound but one critical security issue + three majors need addressing before merge. coderabbit caught all four; surfacing here to gate the CHANGES_REQUESTED.

Blocker — verified against current code

🔴 `src/openhuman/artifacts/store.rs:26-57` — `validate_artifact_id` allows `"."` → full artifacts-root deletion

Coderabbit's finding is correct and I traced it end-to-end:

validate_artifact_id only rejects .., ../, ..\\ — . passes
delete_artifact(workspace, ".") → root.join(".") = root itself
assert_within_root(root, root.join(".")) → starts_with check passes (PathBuf normalizes "./" to just root)
tokio::fs::remove_dir_all(&artifact_dir) wipes the entire artifacts tree

Reproducer: ai_delete_artifact(&config, ".") would delete every artifact in the workspace.

Required fix — reject . explicitly + simplify the .. check (the standalone-equal .. case is already covered by the new arm):

fn validate_artifact_id(id: &str) -> Result<(), String> {
    if id.is_empty() {
        return Err("[artifacts] artifact_id must not be empty".to_string());
    }
    if id == "." || id == ".." {
        return Err(format!(
            "[artifacts] artifact_id must not be a reserved path component: {id:?}"
        ));
    }
    if id.contains('/') { /* unchanged */ }
    if id.contains('\\') { /* unchanged */ }
    if id.starts_with("../") || id.starts_with("..\\") {
        return Err(format!(
            "[artifacts] artifact_id must not be a path traversal: {id:?}"
        ));
    }
    // rest unchanged
}

Also add a regression test:

#[test]
fn validate_artifact_id_rejects_dot() {
    assert!(validate_artifact_id(".").is_err());
}

#[tokio::test]
async fn delete_artifact_rejects_dot_does_not_wipe_root() {
    let tmp = tempfile::tempdir().unwrap();
    save_artifact_meta(tmp.path(), &dummy_meta("real-id")).await.unwrap();
    assert!(delete_artifact(tmp.path(), ".").await.is_err());
    // root + sibling artifact still exist
    assert!(tmp.path().join("artifacts/real-id/meta.json").exists());
}

Major — verified

🟠 `src/openhuman/artifacts/ops.rs:63-69` — `absolute_path` derivation allows escape via `meta.path`

meta.path is joined directly without validation:

let absolute_path = config.workspace_dir.join("artifacts").join(&meta.path)

If a stored ArtifactMeta.path is absolute or contains .. (a corrupted/malicious meta.json that survived disk inspection), absolute_path escapes the artifacts root and is returned to the caller. The caller may then pass it to file-read tools.

Coderabbit's suggested fix (reject is_absolute() + Component::ParentDir / RootDir / Prefix) is correct. Apply it.

🟠 `src/openhuman/artifacts/schemas.rs:87-92` — `get_artifact` schema contract mismatch

Output schema advertises artifact as Ref("ArtifactMeta") but the doc-comment + code adds an absolute_path field. Discovery clients won't know about the extra field. Coderabbit's TypeSchema::Object shape fix is right.

🟠 `.claude/memory.md:217` — security doc steers future domains away from `security/policy.rs`

The memory note advises future authors to "replicate this pattern" for new filesystem-backed domains. That conflicts with the documented policy that file I/O validation should route through src/openhuman/security/policy.rs::validate_path_within_root (verified at line 2030). The artifacts module reimplements containment checks locally instead of delegating — that's a defensible per-module choice, but the memory note shouldn't propagate it as the recommended pattern.

Coderabbit's doc rewrite (clarify the local pattern is domain-specific; new domains should use policy.rs entry points) is correct.

Stronger suggestion — consider whether this PR should also be migrated to use validate_path_within_root:

// replace assert_within_root with the policy-canonical entry point
let resolved = crate::openhuman::security::policy::validate_path_within_root(
    &artifact_dir, &root
)?;

validate_path_within_root canonicalizes both paths before starts_with, which closes the symlink-escape gap that assert_within_root (pure starts_with) doesn't catch. If <workspace>/artifacts/<id> is a symlink pointing outside the workspace, this PR's check passes; the canonical check would catch it. Worth at minimum a comment explaining why symlink resolution was deemed unnecessary if the local check stays.

Minor / nitpicks

src/openhuman/artifacts/store.rs:233 — _assert_status_used dead-code helper feels like a smell; if ArtifactStatus is only referenced in tests, mark the type #[cfg_attr(not(test), allow(dead_code))] or move the test-only usage into a #[cfg(test)] block that uses it directly.
src/openhuman/artifacts/store.rs:18 — artifacts_root calls create_dir_all on every invocation including read paths (list_artifacts, get_artifact). Mildly wasteful + races on first concurrent call. Splitting into artifacts_root_create() (write paths) vs artifacts_root_existing() (read paths) would be cleaner.

Verified / looks good

Module structure (types / store / ops / schemas / mod) is clean and matches openhuman conventions
Pagination (offset/limit with MAX_LIMIT=200) follows established RPC patterns
Corrupt-meta skipping with log::warn! is the right call (vs propagating)
ArtifactKind::parse / ArtifactStatus::parse correctly fall back to Other / Pending on unknown values
Test scaffolding present (store_tests.rs, ops_tests.rs)
1214 lines all-new with no deletions — clean greenfield
coderabbit CHANGES_REQUESTED captures all the security issues

Recommendation

CHANGES_REQUESTED — primarily on the critical . deletion vector. Once that + the two security majors (ops absolute_path, memory doc) land, this is approve-ready.

graycyrus added 2 commits May 28, 2026 07:54

Merge remote-tracking branch 'origin/main' into feat/artifact-storage…

f2dd2b0

…-types

graycyrus marked this pull request as ready for review May 28, 2026 05:22

graycyrus requested a review from a team May 28, 2026 05:22

coderabbitai Bot added feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. working A PR that is being worked on by the team. labels May 28, 2026

coderabbitai Bot requested changes May 28, 2026

View reviewed changes

oxoxDev self-assigned this May 28, 2026

oxoxDev requested changes May 28, 2026

View reviewed changes

M3gA-Mind self-assigned this May 28, 2026

M3gA-Mind merged commit 5237606 into tinyhumansai:main May 28, 2026
37 of 41 checks passed

M3gA-Mind approved these changes May 28, 2026

View reviewed changes

M3gA-Mind mentioned this pull request May 28, 2026

fix(artifacts): harden artifact ID validation and absolute_path #2860

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(artifacts): define artifact storage types, config, and persistence layer#2801

feat(artifacts): define artifact storage types, config, and persistence layer#2801
M3gA-Mind merged 2 commits into
tinyhumansai:mainfrom
graycyrus:feat/artifact-storage-types

graycyrus commented May 28, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 28, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 28, 2026

Uh oh!

coderabbitai Bot May 28, 2026

Uh oh!

coderabbitai Bot May 28, 2026

Uh oh!

coderabbitai Bot May 28, 2026

Uh oh!

oxoxDev left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

graycyrus commented May 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Checklist

Summary by CodeRabbit

New Features

Uh oh!

coderabbitai Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

oxoxDev left a comment

Choose a reason for hiding this comment

Blocker — verified against current code

🔴 src/openhuman/artifacts/store.rs:26-57 — validate_artifact_id allows "." → full artifacts-root deletion

Major — verified

🟠 src/openhuman/artifacts/ops.rs:63-69 — absolute_path derivation allows escape via meta.path

🟠 src/openhuman/artifacts/schemas.rs:87-92 — get_artifact schema contract mismatch

🟠 .claude/memory.md:217 — security doc steers future domains away from security/policy.rs

Minor / nitpicks

Verified / looks good

Recommendation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

graycyrus commented May 28, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 28, 2026 •

edited

Loading

🔴 `src/openhuman/artifacts/store.rs:26-57` — `validate_artifact_id` allows `"."` → full artifacts-root deletion

🟠 `src/openhuman/artifacts/ops.rs:63-69` — `absolute_path` derivation allows escape via `meta.path`

🟠 `src/openhuman/artifacts/schemas.rs:87-92` — `get_artifact` schema contract mismatch

🟠 `.claude/memory.md:217` — security doc steers future domains away from `security/policy.rs`