Add file policy engine for file tool harness by kunpeng-ai-lab · Pull Request #1974 · Hmbown/CodeWhale

kunpeng-ai-lab · 2026-05-24T05:01:31Z

Summary

Reopens the stale pre-v0.8.41 file policy engine work from #1585 on current main.

This adds a harness-level file access policy that can allow or deny file-touching tool calls by category before mutation/read execution. It is intended as a reusable safety layer for workspace policy, diagnostics, and audit behavior.

Changes

Add file policy config/rule parsing and an example refs/execpolicy.example.toml.
Wire policy loading behind the existing feature/config surface.
Enforce file policy for serial tool execution, parallel tool batches, and execute_tool_with_lock defensive execution.
Extract file targets for read_file, write_file, edit_file, and apply_patch including path override and unified diff headers.
Emit tool.file_policy_denied audit metadata on denied execution.
Add design documentation in docs/file-policy-design.md.
Add regression tests for allow/deny rules, default category fallback, disabled feature behavior, apply_patch denial, and policy checks in engine paths.

Validation

cargo fmt --check
cargo test -p codewhale-tui file_policy (15 passed)
cargo test -p codewhale-tui engine:: (117 passed)
cargo test -p codewhale-tui composer_arrows_scroll (10 passed)
cargo check --workspace

Supersedes #1585, which became stale after the v0.8.41 rebrand.

gemini-code-assist

Code Review

This pull request implements a file access policy engine that enforces path-level restrictions for file-related tools using glob-style patterns defined in execpolicy.toml. The changes include the core evaluation logic with regex caching, integration into both serial and parallel tool execution paths, a new FilePolicy feature flag, and comprehensive documentation. Reviewer feedback identifies a bug in handling filenames with spaces in diff headers and potential false positives when parsing patch contents. Further suggestions include improving error reporting for invalid configurations, reordering policy evaluation to better support whitelisting with exceptions, and extending coverage to tools like list_dir and grep_files.

gemini-code-assist · 2026-05-24T05:04:52Z

+}
+
+fn normalize_diff_path(raw: &str) -> Option<String> {
+    let path = raw.split_whitespace().next()?.trim_matches('"');


Using split_whitespace() to extract the path from a diff header will fail for filenames containing spaces. Unified diff headers typically use a tab character (\t) to separate the filename from the timestamp. Using split_whitespace().next() will truncate the path at the first space, leading to incorrect policy enforcement or potential bypasses for files with spaces in their names.

Suggested change

let path = raw.split_whitespace().next()?.trim_matches('"');

let path = raw.split('\t').next()?.trim_matches('"');

gemini-code-assist · 2026-05-24T05:04:52Z

+        let file_policy = if config.features.enabled(Feature::FilePolicy) {
+            crate::execpolicy::load_default_policy()
+                .ok()
+                .flatten()
+                .map(Arc::new)
+        } else {
+            None
+        };


The policy engine fails silently if the configuration file exists but is invalid (e.g., due to a syntax error). Using .ok() converts the error into None, effectively disabling the safety layer without notifying the user. For a security-related feature, it is better to log a high-visibility error or surface the failure to the user so they are aware that the expected safety guardrails are not active.

gemini-code-assist · 2026-05-24T05:04:52Z

+    pub fn evaluate_file(&self, category: &str, path: &str) -> ExecPolicyDecision {
+        let mut rule_sets = Vec::new();
+        if let Some(rules) = self.file_rules.get(category) {
+            rule_sets.push(rules);
+        }
+        if category != "default"
+            && let Some(rules) = self.file_rules.get("default")
+        {
+            rule_sets.push(rules);
+        }
+
+        if rule_sets.is_empty() {
+            return ExecPolicyDecision::Allow;
+        }
+
+        for rules in &rule_sets {
+            for pattern in &rules.deny {
+                if pattern_matches_path(pattern, path) {
+                    return ExecPolicyDecision::Deny(format!(
+                        "file policy denied by {category}: {pattern}"
+                    ));
+                }
+            }
+        }
+
+        for rules in &rule_sets {
+            for pattern in &rules.allow {
+                if pattern_matches_path(pattern, path) {
+                    return ExecPolicyDecision::Allow;
+                }
+            }
+        }
+
+        ExecPolicyDecision::AskUser(format!(
+            "file policy: no matching allow rule for {category}"
+        ))
+    }


The current implementation checks all deny rules (category then default) before any allow rules. This makes it impossible to create a "whitelist with exceptions" policy. For example, if default has deny = ["*"], no specific allow rule in a category (like read_file) can ever permit a file because the global deny will always trigger first. Reordering the checks to evaluate the specific category (deny then allow) before falling back to the default category would be more flexible and intuitive.

pub fn evaluate_file(&self, category: &str, path: &str) -> ExecPolicyDecision { if let Some(rules) = self.file_rules.get(category) { for pattern in &rules.deny { if pattern_matches_path(pattern, path) { return ExecPolicyDecision::Deny(format!( "file policy denied by {category}: {pattern}" )); } } for pattern in &rules.allow { if pattern_matches_path(pattern, path) { return ExecPolicyDecision::Allow; } } } if category != "default" && let Some(rules) = self.file_rules.get("default") { for pattern in &rules.deny { if pattern_matches_path(pattern, path) { return ExecPolicyDecision::Deny(format!( "file policy denied by default: {pattern}" )); } } for pattern in &rules.allow { if pattern_matches_path(pattern, path) { return ExecPolicyDecision::Allow; } } } if self.file_rules.is_empty() { ExecPolicyDecision::Allow } else { ExecPolicyDecision::AskUser(format!( "file policy: no matching allow rule for {category}" )) } }

gemini-code-assist · 2026-05-24T05:04:52Z

+pub fn is_file_tool(tool_name: &str) -> bool {
+    matches!(
+        tool_name,
+        "read_file" | "write_file" | "edit_file" | "apply_patch"
+    )
+}
+
+pub fn extract_file_path<'a>(
+    tool_input: &'a serde_json::Value,
+    tool_name: &str,
+) -> Option<&'a str> {
+    match tool_name {
+        "read_file" | "write_file" | "edit_file" | "apply_patch" => {
+            tool_input.get("path")?.as_str()
+        }
+        _ => None,
+    }
+}
+
+pub fn extract_file_paths(tool_input: &serde_json::Value, tool_name: &str) -> Vec<String> {
+    match tool_name {
+        "read_file" | "write_file" | "edit_file" => extract_file_path(tool_input, tool_name)
+            .map(|path| vec![path.to_string()])
+            .unwrap_or_default(),
+        "apply_patch" => extract_apply_patch_paths(tool_input),
+        _ => Vec::new(),
+    }
+}


The policy engine currently omits list_dir and grep_files, which can be used to access sensitive information or discover workspace structure. These tools should be included in the file access policy to provide a comprehensive safety layer.

pub fn is_file_tool(tool_name: &str) -> bool { matches!( tool_name, "read_file" | "write_file" | "edit_file" | "apply_patch" | "list_dir" | "grep_files" ) } pub fn extract_file_path<'a>( tool_input: &'a serde_json::Value, tool_name: &str, ) -> Option<&'a str> { match tool_name { "read_file" | "write_file" | "edit_file" | "apply_patch" | "list_dir" | "grep_files" => { tool_input.get("path")?.as_str() } _ => None, } } pub fn extract_file_paths(tool_input: &serde_json::Value, tool_name: &str) -> Vec<String> { match tool_name { "read_file" | "write_file" | "edit_file" | "list_dir" | "grep_files" => { extract_file_path(tool_input, tool_name) .map(|path| vec![path.to_string()]) .unwrap_or_default() } "apply_patch" => extract_apply_patch_paths(tool_input), _ => Vec::new(), } }

gemini-code-assist · 2026-05-24T05:04:52Z

+    for line in patch.lines() {
+        let candidate = line
+            .strip_prefix("+++ ")
+            .or_else(|| line.strip_prefix("--- "));
+        let Some(candidate) = candidate else {
+            continue;
+        };
+        if let Some(path) = normalize_diff_path(candidate) {
+            push_unique_path(&mut paths, path);
+        }
+    }


Scanning the entire patch body for lines starting with --- or +++ can lead to false positives if the file content being patched contains these sequences (e.g., when patching a diff file or documentation about diffs). This could cause the policy engine to block a patch based on text that is actually part of the file content rather than a target path. Consider a more robust approach that only parses headers outside of hunks.

Add file policy engine

782f3a1

gemini-code-assist Bot reviewed May 24, 2026

View reviewed changes

kunpeng-ai-lab mentioned this pull request May 24, 2026

Add file policy engine for file tool harness #1585

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add file policy engine for file tool harness#1974

Add file policy engine for file tool harness#1974
kunpeng-ai-lab wants to merge 1 commit into
Hmbown:mainfrom
kunpeng-ai-lab:cdx/file-policy-engine-v0841

kunpeng-ai-lab commented May 24, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 24, 2026

Uh oh!

gemini-code-assist Bot May 24, 2026

Uh oh!

gemini-code-assist Bot May 24, 2026

Uh oh!

gemini-code-assist Bot May 24, 2026

Uh oh!

gemini-code-assist Bot May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	let path = raw.split_whitespace().next()?.trim_matches('"');
	let path = raw.split('\t').next()?.trim_matches('"');

Conversation

kunpeng-ai-lab commented May 24, 2026

Summary

Changes

Validation

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant