Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions .claude/memory.md
Original file line number Diff line number Diff line change
Expand Up @@ -210,6 +210,13 @@ Quick reference for anyone starting with Claude on this project. Updated by the
- **Kill stuck processes** — `lsof -i :7788` then `kill <PID>`. Useful when `dev:app` reports a stale listener and you want to force a fresh boot rather than relying on the handle's auto-recovery.
- **Skills runtime removed** — the QuickJS / `rquickjs` runtime is gone; `src/openhuman/skills/` is metadata-only ("Legacy skill metadata helpers retained after QuickJS runtime removal"). Skill execution surfaces are being rebuilt; don't assume a `.skill` can run end-to-end without checking the current code.

## Artifacts Domain (Issue #2776)

- **Filesystem-backed persistence, no SQLite** — `src/openhuman/artifacts/` stores JSON metadata (`meta.json`) + binary blobs under `<workspace_dir>/artifacts/<uuid>/`. Pattern mirrors `memory/ops/files.rs` but simpler.
- **`"ai"` namespace in controller registry** — RPC methods are `openhuman.ai_list_artifacts`, `openhuman.ai_get_artifact`, `openhuman.ai_delete_artifact`. Future `ai_*` methods should use this same namespace.
- **Two-layer path validation required** — (1) `validate_artifact_id` rejects empty strings, `/`, `\`, `..`, absolute Unix paths, Windows `C:` and UNC `\\` paths; (2) `assert_within_root` canonicalizes and checks containment. Replicate this pattern for any new filesystem-backed domain.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Correct the security guidance and assert_within_root description.

This line is currently misleading in two ways: assert_within_root here does not canonicalize paths (it checks starts_with), and advising teams to “replicate this pattern” conflicts with the documented policy that file I/O validation must go through src/openhuman/security/policy.rs (validate_path / validate_parent_path). Please update this note to avoid steering future domains toward a non-policy pattern.

Suggested doc fix
-- **Two-layer path validation required** — (1) `validate_artifact_id` rejects empty strings, `/`, `\`, `..`, absolute Unix paths, Windows `C:` and UNC `\\` paths; (2) `assert_within_root` canonicalizes and checks containment. Replicate this pattern for any new filesystem-backed domain.
+- **Artifacts path safety (domain-specific)** — this domain currently uses (1) `validate_artifact_id` input checks and (2) `assert_within_root` containment checks before filesystem operations. Note: `assert_within_root` is a containment check (not canonicalization). For new filesystem-backed domains, follow the security policy entry points in `src/openhuman/security/policy.rs` (`validate_path` / `validate_parent_path`).
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- **Two-layer path validation required** — (1) `validate_artifact_id` rejects empty strings, `/`, `\`, `..`, absolute Unix paths, Windows `C:` and UNC `\\` paths; (2) `assert_within_root` canonicalizes and checks containment. Replicate this pattern for any new filesystem-backed domain.
- **Artifacts path safety (domain-specific)**this domain currently uses (1) `validate_artifact_id` input checks and (2) `assert_within_root` containment checks before filesystem operations. Note: `assert_within_root` is a containment check (not canonicalization). For new filesystem-backed domains, follow the security policy entry points in `src/openhuman/security/policy.rs` (`validate_path` / `validate_parent_path`).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/memory.md at line 217, Update the note to correct the behavior and
point to the canonical policy: clarify that assert_within_root does not
canonicalize but only checks prefix containment (starts_with), remove the
instruction to “replicate this pattern” for new domains, and instead direct
teams to use the centralized file I/O validation in
src/openhuman/security/policy.rs (i.e., call validate_path /
validate_parent_path) while keeping validate_artifact_id's two-layer checks as
an example for artifact IDs only.

- **`cargo test --lib` required for lib crate tests** — `cargo test -p openhuman -- "artifacts"` lists tests but filters to 0. Must use `cargo test -p openhuman --lib -- "artifacts"` because tests are in the lib crate, not integration test binaries.

## Rust Testing Patterns

- **Memory tree tests filter** — `cargo test -p openhuman -- "memory::tree"` runs the memory tree unit tests (602 tests); full module paths are `openhuman::memory::tree::ingest::tests::*` and `openhuman::memory::tree::canonicalize::email_clean::tests::*`.
Expand Down
4 changes: 4 additions & 0 deletions src/core/all.rs
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,8 @@ fn build_registered_controllers() -> Vec<RegisteredController> {
controllers.extend(crate::openhuman::security::all_security_registered_controllers());
// Interactive approval workflow (#1339 — gate external-effect tool calls)
controllers.extend(crate::openhuman::approval::all_approval_registered_controllers());
// Agent-generated artifact storage, retrieval, and lifecycle management
controllers.extend(crate::openhuman::artifacts::all_artifacts_registered_controllers());
// Background heartbeat loop controls
controllers.extend(crate::openhuman::heartbeat::all_heartbeat_registered_controllers());
// Ad-hoc static directory HTTP hosting for local file sharing / previews
Expand Down Expand Up @@ -294,6 +296,7 @@ fn build_declared_controller_schemas() -> Vec<ControllerSchema> {
schemas.extend(crate::openhuman::encryption::all_encryption_controller_schemas());
schemas.extend(crate::openhuman::security::all_security_controller_schemas());
schemas.extend(crate::openhuman::approval::all_approval_controller_schemas());
schemas.extend(crate::openhuman::artifacts::all_artifacts_controller_schemas());
schemas.extend(crate::openhuman::heartbeat::all_heartbeat_controller_schemas());
schemas.extend(crate::openhuman::http_host::all_http_host_controller_schemas());
schemas.extend(crate::openhuman::cost::all_cost_controller_schemas());
Expand Down Expand Up @@ -389,6 +392,7 @@ pub fn rpc_method_name(schema: &ControllerSchema) -> String {
pub fn namespace_description(namespace: &str) -> Option<&'static str> {
match namespace {
"about_app" => Some("Catalog the app's user-facing capabilities and where to find them."),
"ai" => Some("Agent-generated artifact storage, retrieval, and lifecycle management."),
"app_state" => Some("Expose core-owned app shell state for frontend polling."),
"auth" => Some("Manage app session and provider credentials."),
"agent_experience" => Some("Local procedural experience capture and retrieval for agents."),
Expand Down
10 changes: 10 additions & 0 deletions src/openhuman/artifacts/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
pub mod ops;
pub mod schemas;
pub mod store;
pub mod types;

pub use schemas::{
all_controller_schemas as all_artifacts_controller_schemas,
all_registered_controllers as all_artifacts_registered_controllers,
};
pub use types::{ArtifactKind, ArtifactMeta, ArtifactStatus};
114 changes: 114 additions & 0 deletions src/openhuman/artifacts/ops.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
use serde_json::{json, Value};

use crate::openhuman::config::Config;
use crate::rpc::RpcOutcome;

use super::store;

/// Default page size for `ai_list_artifacts`.
const DEFAULT_LIMIT: usize = 50;
/// Maximum page size cap for `ai_list_artifacts`.
const MAX_LIMIT: usize = 200;

/// List artifacts in the workspace with pagination.
///
/// Returns `{ "artifacts": [...], "total": N, "offset": M, "limit": L }`.
pub async fn ai_list_artifacts(
config: &Config,
offset: Option<usize>,
limit: Option<usize>,
) -> Result<RpcOutcome<Value>, String> {
let offset = offset.unwrap_or(0);
let limit = limit.unwrap_or(DEFAULT_LIMIT).min(MAX_LIMIT);
log::debug!(
"[artifacts] ai_list_artifacts: workspace={:?} offset={offset} limit={limit}",
config.workspace_dir
);

let (artifacts, total) = store::list_artifacts(&config.workspace_dir, offset, limit).await?;

log::debug!(
"[artifacts] ai_list_artifacts: returning {} of {total} total",
artifacts.len()
);

let value = json!({
"artifacts": artifacts,
"total": total,
"offset": offset,
"limit": limit,
});
Ok(RpcOutcome::new(value, vec![]))
}

/// Retrieve a single artifact by ID.
///
/// Returns the serialized `ArtifactMeta` plus an `absolute_path` field
/// pointing to the full on-disk location of the artifact files.
pub async fn ai_get_artifact(
config: &Config,
artifact_id: &str,
) -> Result<RpcOutcome<Value>, String> {
log::debug!(
"[artifacts] ai_get_artifact: id={artifact_id} workspace={:?}",
config.workspace_dir
);

if artifact_id.is_empty() {
return Err("[artifacts] artifact_id must not be empty".to_string());
}

let meta = store::get_artifact(&config.workspace_dir, artifact_id).await?;

// Compute absolute path for the caller's convenience.
let absolute_path = config
.workspace_dir
.join("artifacts")
.join(&meta.path)
.to_string_lossy()
.into_owned();
Comment on lines +63 to +69
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden absolute_path derivation against unsafe meta.path.

meta.path is used directly in join(...). If it is absolute (or contains parent traversal), the computed absolute_path can escape the artifacts root.

Suggested fix
+    use std::path::{Component, Path};
+
+    let rel = Path::new(&meta.path);
+    if rel.is_absolute()
+        || rel
+            .components()
+            .any(|c| matches!(c, Component::ParentDir | Component::RootDir | Component::Prefix(_)))
+    {
+        return Err(format!(
+            "[artifacts] invalid artifact path for id={artifact_id}: {:?}",
+            meta.path
+        ));
+    }
+
+    let artifacts_root = config.workspace_dir.join("artifacts");
     let absolute_path = config
-        .workspace_dir
-        .join("artifacts")
+        .workspace_dir
+        .join("artifacts")
         .join(&meta.path)
         .to_string_lossy()
         .into_owned();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Compute absolute path for the caller's convenience.
let absolute_path = config
.workspace_dir
.join("artifacts")
.join(&meta.path)
.to_string_lossy()
.into_owned();
// Compute absolute path for the caller's convenience.
use std::path::{Component, Path};
let rel = Path::new(&meta.path);
if rel.is_absolute()
|| rel
.components()
.any(|c| matches!(c, Component::ParentDir | Component::RootDir | Component::Prefix(_)))
{
return Err(format!(
"[artifacts] invalid artifact path for id={artifact_id}: {:?}",
meta.path
));
}
let artifacts_root = config.workspace_dir.join("artifacts");
let absolute_path = config
.workspace_dir
.join("artifacts")
.join(&meta.path)
.to_string_lossy()
.into_owned();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/artifacts/ops.rs` around lines 63 - 69, The current derivation
of absolute_path uses meta.path directly which allows absolute paths or
parent-traversal to escape the artifacts root; update the logic that builds
absolute_path (the code touching absolute_path, config.workspace_dir, and
meta.path) to first validate/sanitize meta.path: reject or canonicalize if
Path::new(&meta.path).is_absolute() and reject or disallow any
Path::new(&meta.path).components() containing Component::ParentDir, or
alternatively rebuild a safe relative PathBuf by filtering components to only
Normal components before joining; if validation fails return an error instead of
joining, then join the sanitized relative path to
config.workspace_dir.join("artifacts") and convert to string as before.


let mut value =
serde_json::to_value(&meta).map_err(|e| format!("[artifacts] serialization error: {e}"))?;
if let Some(obj) = value.as_object_mut() {
obj.insert(
"absolute_path".to_string(),
Value::String(absolute_path.clone()),
);
}

log::debug!(
"[artifacts] ai_get_artifact: found id={artifact_id} absolute_path={absolute_path}"
);
Ok(RpcOutcome::new(value, vec![]))
}

/// Delete an artifact and all associated files.
///
/// Returns `{ "artifact_id": "...", "deleted": true }`.
pub async fn ai_delete_artifact(
config: &Config,
artifact_id: &str,
) -> Result<RpcOutcome<Value>, String> {
log::debug!(
"[artifacts] ai_delete_artifact: id={artifact_id} workspace={:?}",
config.workspace_dir
);

if artifact_id.is_empty() {
return Err("[artifacts] artifact_id must not be empty".to_string());
}

store::delete_artifact(&config.workspace_dir, artifact_id).await?;

log::debug!("[artifacts] ai_delete_artifact: deleted id={artifact_id}");
let value = json!({
"artifact_id": artifact_id,
"deleted": true,
});
Ok(RpcOutcome::new(value, vec![]))
}

#[cfg(test)]
#[path = "ops_tests.rs"]
mod tests;
46 changes: 46 additions & 0 deletions src/openhuman/artifacts/ops_tests.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
use tempfile::TempDir;

use super::*;
use crate::openhuman::config::Config;

fn test_config(tmp: &TempDir) -> Config {
Config {
workspace_dir: tmp.path().to_path_buf(),
config_path: tmp.path().join("config.toml"),
..Config::default()
}
}

// ── ai_list_artifacts ──────────────────────────────────────────────────────

#[tokio::test]
async fn list_empty() {
let tmp = TempDir::new().unwrap();
let config = test_config(&tmp);
let outcome = ai_list_artifacts(&config, None, None).await.unwrap();
let value = outcome.into_cli_compatible_json().unwrap();
assert_eq!(value["total"], 0);
assert_eq!(value["artifacts"].as_array().unwrap().len(), 0);
assert_eq!(value["offset"], 0);
assert_eq!(value["limit"], DEFAULT_LIMIT as u64);
}

// ── ai_get_artifact ────────────────────────────────────────────────────────

#[tokio::test]
async fn get_missing_id_error() {
let tmp = TempDir::new().unwrap();
let config = test_config(&tmp);
let err = ai_get_artifact(&config, "").await.unwrap_err();
assert!(err.contains("must not be empty"), "unexpected error: {err}");
}

// ── ai_delete_artifact ─────────────────────────────────────────────────────

#[tokio::test]
async fn delete_missing_id_error() {
let tmp = TempDir::new().unwrap();
let config = test_config(&tmp);
let err = ai_delete_artifact(&config, "").await.unwrap_err();
assert!(err.contains("must not be empty"), "unexpected error: {err}");
}
Loading
Loading