diff --git a/README.md b/README.md index 87b05a8..5bbf95c 100644 --- a/README.md +++ b/README.md @@ -4,6 +4,41 @@ Multi-persona agent for submitting a Pull Request to your favorite GitHub reposi All you need is Bash 3.2, Rust and some API keys. +## What This Project Is + +sh-agent-pull-request-master is a Bash-driven, multi-persona automation agent that: +- Reads a single directive written in natural language +- Coordinates multiple specialized personas (research, planning, engineering, review) +- Produces a real GitHub Pull Request against a target repository + +It is designed for **automated, auditable code changes**, not interactive coding or chat-based assistance. + +## What This Project Is Not + +- ❌ A general-purpose AI coding assistant +- ❌ A GitHub Action (it runs locally) +- ❌ A replacement for human code review +- ❌ A tool that resolves merge conflicts +- ❌ A tool that infers intent beyond the explicit directive +- ❌ A tool that bypasses GitHub permissions + +## Safety Features + +This project includes safety features for code editing: +- ✅ All file edits are atomic (all succeed or all roll back) +- ✅ Dry-run mode validates changes without touching disk +- ✅ No partial corruption of repositories +- ✅ Machine-readable JSON output for auditability + +## High-Level Mental Model + +Think of this project as a **scripted PR author**: + +1. You write a single directive describing your goal +2. The agent decomposes that goal across personas: research, planning, engineering, review +3. Changes are applied safely using a transactional edit engine +4. A GitHub Pull Request is created and annotated with review feedback + ## How It Works Check out [AGENTS.md](https://github.com/internet-development/sh-agent-pull-request-master) for a full breakdown. @@ -53,18 +88,31 @@ To change what the agent works on, edit the `.directive` file directly. The agen ## Environment Variables -Create a `.env` file with: +Create a `.env` file with your configuration. 
+ +### Required Environment Variables + +```bash +# At least ONE API key is required (choose your provider) +API_KEY_ANTHROPIC=... # For Claude models +OPENAI_API_KEY=... # For GPT models + +# GitHub configuration (all required) +GITHUB_TOKEN=... # Token with repo permissions +GITHUB_REPO_AGENTS_WILL_WORK_ON=owner/repo # Target repository +GITHUB_USERNAME=... # Your GitHub username +``` + +### Optional Environment Variables ```bash -API_KEY_ANTHROPIC=... -GITHUB_TOKEN=... -GITHUB_REPO_AGENTS_WILL_WORK_ON=owner/repo -GITHUB_USERNAME=... -API_KEY_OPEN_AI=... +# For web search capabilities (optional) API_KEY_GOOGLE_CUSTOM_SEARCH=... GOOGLE_CUSTOM_SEARCH_ID=... ``` +> **Note:** You do **not** need all API keys configured to start—only one provider is required. + **Important:** `GITHUB_REPO_AGENTS_WILL_WORK_ON` specifies the repository where the agent will create PRs, NOT this agent's repository. For example, if you want the agent to work on `internet-development/nextjs-sass-starter`, set: ```bash @@ -74,7 +122,7 @@ GITHUB_REPO_AGENTS_WILL_WORK_ON=internet-development/nextjs-sass-starter ## Prerequisites - `bash` (3.2+) -- `rust` for the Engineer +- `rust` (1.70+) for the Engineer - `curl` for API requests (standard on macOS/Linux) - `git` for version control operations - `jq` for JSON parsing (required) @@ -83,11 +131,36 @@ GITHUB_REPO_AGENTS_WILL_WORK_ON=internet-development/nextjs-sass-starter Your `GITHUB_TOKEN` needs these permissions on the target repository: +**Classic Tokens:** - `repo` - Full control of private repositories - `write:discussion` - Write access to discussions (for PR comments) +**Fine-Grained Tokens (recommended for 2025+):** +- `contents: write` - To push commits +- `pull_requests: write` - To create and update PRs +- `metadata: read` - Basic repository access + If working on a public repo you don't own, you'll need to fork it first and set `GITHUB_REPO_AGENTS_WILL_WORK_ON` to your fork. 
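The classic-token scopes listed above are reported by GitHub as a comma-separated list. As a minimal sketch of how such a list could be checked for a required scope, the helper below uses plain Bash pattern matching. The function name `has_scope` and its argument order are assumptions made for illustration; this is not the agent's own code.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the agent): succeeds when a
# comma-separated scope list contains the wanted scope exactly.
#   $1 = scope list, e.g. "repo, write:discussion"
#   $2 = scope to look for, e.g. "repo"
has_scope() {
  local scopes wanted="$2"
  # Drop whitespace so "repo, write:discussion" becomes "repo,write:discussion".
  scopes=$(printf '%s' "$1" | tr -d '[:space:]')
  # Wrap the list in commas so only whole entries match:
  # "public_repo" must not satisfy a check for "repo".
  case ",$scopes," in
    *",$wanted,"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Calling `has_scope "repo, write:discussion" "repo"` succeeds, while `has_scope "public_repo" "repo"` fails because the match is delimited by commas rather than being a substring test.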
+### Token Validation + +The agent validates your GitHub token at startup (`./agent.sh status`) with a **fail-fast** approach: + +1. **Token presence** - Verifies `GITHUB_TOKEN` is set and non-empty +2. **Authentication** - Confirms the token authenticates successfully with GitHub API +3. **OAuth scopes** - For classic tokens, parses the `X-OAuth-Scopes` response header and validates required scopes (`repo` or `public_repo`) are present +4. **Repository access** - Checks that the token can read the target repository + +If any validation fails, the agent exits immediately with a non-zero status code and a clear error message. + +**Token type behavior:** +- **Classic tokens**: The agent reads the `X-OAuth-Scopes` header from the `/user` API response and fails fast if `repo` or `public_repo` scope is missing. A warning is issued if `write:discussion` scope is missing (needed for PR comments). +- **Fine-grained tokens**: These don't expose scopes via headers, so permissions are enforced by GitHub at operation time. The agent will note this during validation. + +**Important limitations:** +- **Write permissions** for fine-grained tokens are enforced by GitHub when operations are attempted +- If you see "permission denied" errors during `git push` or PR creation, verify your token has the required write permissions + ## Safety Guarantees The `apply-edits` tool provides strong safety guarantees by default: @@ -97,14 +170,14 @@ The `apply-edits` tool provides strong safety guarantees by default: - **Dry-Run Mode**: Simulate all changes without touching disk using `--dry-run`. Validates that all edits would succeed before any changes are made. - **Partial Mode (opt-in)**: Use `--partial` to continue applying edits even if some fail (non-atomic). -### Behavior Clarifications (v1.x) +### Behavior Notes (v1.x Stability Guarantee) -The following behaviors are guaranteed for all v1.x releases and are backward compatible with previous v1 releases. 
These guarantees are enforced by integration tests (see `tools/apply-edits/src/` test modules): +The following behaviors are **guaranteed stable for all v1.x releases** and are tested in `tools/apply-edits/tests/integration_tests.rs`. The JSON output schema is backward compatible within the v1.x series: 1. **Dry-run validation**: `--dry-run` performs full validation of all edits against actual file contents. It reports exactly what would happen without modifying any files. 2. **Atomic rollback**: In default (atomic) mode, if edit N fails, all previously successful edits (1 through N-1) are rolled back to their original state. 3. **Partial continuation**: With `--partial`, failed edits are skipped but successful edits are preserved. The exit code is non-zero if any edit fails. -4. **JSON output stability**: The JSON output schema includes `success` (boolean), `applied` (number), `failed` (number), and `edits` (array). Each edit entry includes `status`, `index`, `path`, and `type`. Error entries additionally include `error`, `message`, and contextual fields like `hint` and `closest_matches`. +4. **JSON output schema (stable)**: The JSON output schema includes `success` (boolean), `applied` (number), `failed` (number), and `edits` (array). Each edit entry includes `status`, `index`, `path`, and `type`. Error entries additionally include `error`, `message`, and contextual fields like `hint` and `closest_matches`. This schema is stable and backward compatible for all v1.x releases. Breaking changes, if any, will only occur in v2.x or later. ## Mental Model @@ -118,6 +191,26 @@ Think of `apply-edits` as a transactional patch engine: The `read` subcommand supports both `--file` for a single file and `--files` for comma-separated lists, with optional `--max-lines` and `--format` (json or prompt) flags. 
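The atomic rollback behavior described above (if edit N fails, edits 1 through N-1 are restored) can be illustrated with a small Bash sketch. This is not how `apply-edits` is implemented. The function name `apply_atomically`, the "append one line" edit, and the snapshot-to-temp-directory strategy are all assumptions chosen to keep the example short.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of all-or-nothing editing; NOT the apply-edits code.
# Usage: apply_atomically PATH LINE [PATH LINE ...]
# Each "edit" appends LINE to PATH and fails if PATH is not an existing file.
apply_atomically() {
  local backup_dir path line
  local i=0 ok=1
  local -a paths
  paths=()
  backup_dir=$(mktemp -d) || return 1

  while [ "$#" -ge 2 ]; do
    path="$1"; line="$2"; shift 2
    # Snapshot the file before touching it; a missing file fails the edit.
    if [ -f "$path" ] && cp "$path" "$backup_dir/$i"; then
      paths[$i]="$path"
      i=$((i + 1))
      printf '%s\n' "$line" >> "$path" || { ok=0; break; }
    else
      ok=0
      break
    fi
  done

  if [ "$ok" -eq 0 ]; then
    # A later edit failed: restore every snapshot, newest first.
    while [ "$i" -gt 0 ]; do
      i=$((i - 1))
      cp "$backup_dir/$i" "${paths[$i]}"
    done
  fi

  rm -rf "$backup_dir"
  [ "$ok" -eq 1 ]
}
```

Restoring snapshots in reverse order means a file edited more than once ends up at its earliest snapshot, which is the pre-run state. A real transactional engine would also need to handle file creation, permissions, and crashes mid-restore, which this sketch ignores.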
+## Typical Use Cases + +- Applying large, repetitive refactors safely across many files +- Generating PRs from structured natural-language directives +- Automating maintenance changes (dependency updates, license headers) +- Experimenting with multi-agent planning and review workflows +- CI/CD integration for automated code modifications + +## Your First Successful Run + +After setup, verify everything works: + +✅ `./agent.sh status` shows all required tools installed +✅ `./agent.sh dry-run` completes without errors +✅ A Pull Request is created in the target repository +✅ No files are modified locally during dry-run + +If any step fails, check your environment variables and token permissions. + + ## Questions If you have questions ping me on Twitter, [@wwwjim](https://www.twitter.com/wwwjim). Or you can ping [@internetxstudio](https://x.com/internetxstudio). diff --git a/lib/github.sh b/lib/github.sh index 508fcb0..84397d0 100644 --- a/lib/github.sh +++ b/lib/github.sh @@ -1,141 +1,157 @@ -#!/bin/bash -# -# NOTE(jimmylee) -# GitHub API utilities using curl. No gh CLI dependency. 
+#!/usr/bin/env bash +# NOTE(jimmylee): GitHub API utilities with token validation -[[ -n "${_GITHUB_SH_LOADED:-}" ]] && return 0 -_GITHUB_SH_LOADED=1 - -set -euo pipefail - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -source "${SCRIPT_DIR}/json.sh" - -# Legacy alias for backwards compatibility -github_json_escape() { - json_escape "$1" +# Validates GitHub token and checks required permissions +# Fails fast with clear error messages if validation fails +validate_github_token_access() { + local token="$1" + local repo="$2" + + if [[ -z "$token" ]]; then + echo "ERROR: GITHUB_TOKEN is not set or empty" >&2 + return 1 + fi + + # Create temp files for response handling + local headers_file + local body_file + headers_file=$(mktemp) + body_file=$(mktemp) + + # Cleanup on exit + trap "rm -f '$headers_file' '$body_file'" RETURN + + # Make API call capturing headers and body separately + local http_code + http_code=$(curl -s -w "%{http_code}" \ + -H "Authorization: Bearer $token" \ + -H "Accept: application/vnd.github+json" \ + -H "X-GitHub-Api-Version: 2022-11-28" \ + -D "$headers_file" \ + -o "$body_file" \ + "https://api.github.com/user") + + # Check HTTP status + if [[ "$http_code" != "200" ]]; then + echo "ERROR: GitHub token authentication failed (HTTP $http_code)" >&2 + if [[ -f "$body_file" ]]; then + local message + message=$(jq -r '.message // empty' "$body_file" 2>/dev/null) + [[ -n "$message" ]] && echo " GitHub says: $message" >&2 + fi + return 1 + fi + + # Extract OAuth scopes from headers (only present for classic tokens) + local oauth_scopes + oauth_scopes=$(grep -i '^x-oauth-scopes:' "$headers_file" | cut -d':' -f2- | tr -d '[:space:]') + + # Check if this is a classic token (has X-OAuth-Scopes header) + if [[ -n "$oauth_scopes" ]]; then + # Classic token - validate required scopes + local has_repo_scope=false + local has_discussion_scope=false + + # Check for repo or public_repo scope + if echo "$oauth_scopes" | grep -qE '(^|,)repo(,|$)'; 
then + has_repo_scope=true + elif echo "$oauth_scopes" | grep -qE '(^|,)public_repo(,|$)'; then + has_repo_scope=true + fi + + # Check for write:discussion scope + if echo "$oauth_scopes" | grep -qE '(^|,)write:discussion(,|$)'; then + has_discussion_scope=true + fi + + if [[ "$has_repo_scope" != "true" ]]; then + echo "ERROR: GitHub classic token missing required scope" >&2 + echo " Required: 'repo' or 'public_repo'" >&2 + echo " Found scopes: $oauth_scopes" >&2 + echo " Please create a new token with the required scopes" >&2 + return 1 + fi + + if [[ "$has_discussion_scope" != "true" ]]; then + echo "WARNING: GitHub classic token missing 'write:discussion' scope" >&2 + echo " PR comments may fail without this scope" >&2 + # Don't fail, just warn - PR creation can still work + fi + + echo "✓ Classic token validated with scopes: $oauth_scopes" >&2 + else + # Fine-grained token or GitHub App token - no scope header + # Permissions are enforced at operation time by GitHub + echo "✓ Token authenticated (fine-grained or app token - permissions checked at operation time)" >&2 + fi + + # If repo is specified, validate repository access + if [[ -n "$repo" ]]; then + local repo_http_code + repo_http_code=$(curl -s -w "%{http_code}" \ + -H "Authorization: Bearer $token" \ + -H "Accept: application/vnd.github+json" \ + -H "X-GitHub-Api-Version: 2022-11-28" \ + -o "$body_file" \ + "https://api.github.com/repos/$repo") + + if [[ "$repo_http_code" != "200" ]]; then + echo "ERROR: Cannot access repository '$repo' (HTTP $repo_http_code)" >&2 + if [[ -f "$body_file" ]]; then + local message + message=$(jq -r '.message // empty' "$body_file" 2>/dev/null) + [[ -n "$message" ]] && echo " GitHub says: $message" >&2 + fi + return 1 + fi + + echo "✓ Repository access verified: $repo" >&2 + fi + + return 0 } -# NOTE(jimmylee) -# Makes a GitHub API request using curl -# Usage: github_api [data] -github_api() { +# Makes a GitHub API request with proper error handling +# Usage: 
github_api_request [data] +github_api_request() { local method="$1" local endpoint="$2" - local data="${3:-}" + local data="$3" + local token="${GITHUB_TOKEN:-}" - if [[ -z "${GITHUB_TOKEN:-}" ]]; then + if [[ -z "$token" ]]; then echo "ERROR: GITHUB_TOKEN not set" >&2 return 1 fi - local url="https://api.github.com${endpoint}" - local args=( + local body_file + body_file=$(mktemp) + trap "rm -f '$body_file'" RETURN + + local curl_args=( -s + -w "%{http_code}" -X "$method" - -H "Authorization: Bearer ${GITHUB_TOKEN}" + -H "Authorization: Bearer $token" -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" + -o "$body_file" ) if [[ -n "$data" ]]; then - args+=(-d "$data") + curl_args+=(-H "Content-Type: application/json" -d "$data") fi - curl "${args[@]}" "$url" -} - -# NOTE(jimmylee) -# Get authenticated user info -# Returns the login username on success -github_get_user() { - local response - response=$(github_api GET "/user") - - if echo "$response" | grep -q '"login"'; then - echo "$response" | grep -o '"login"[[:space:]]*:[[:space:]]*"[^"]*"' | head -1 | sed 's/.*"login"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/' - return 0 - else - echo "ERROR: Failed to get user info" >&2 - return 1 - fi -} - -# NOTE(jimmylee) -# Check if a repository is accessible -# Usage: github_check_repo -github_check_repo() { - local repo="$1" - local response - response=$(github_api GET "/repos/${repo}") - - if echo "$response" | grep -q '"full_name"'; then - echo "$response" | grep -o '"name"[[:space:]]*:[[:space:]]*"[^"]*"' | head -1 | sed 's/.*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/' - return 0 - else - echo "ERROR: Cannot access repository: $repo" >&2 - return 1 - fi -} - -# NOTE(jimmylee) -# Get PR details -# Usage: github_get_pr -github_get_pr() { - local repo="$1" - local pr_number="$2" - - github_api GET "/repos/${repo}/pulls/${pr_number}" -} - -# NOTE(jimmylee) -# Get PR diff -# Usage: github_get_pr_diff -github_get_pr_diff() { - local 
repo="$1" - local pr_number="$2" - - if [[ -z "${GITHUB_TOKEN:-}" ]]; then - echo "ERROR: GITHUB_TOKEN not set" >&2 - return 1 - fi + local http_code + http_code=$(curl "${curl_args[@]}" "https://api.github.com$endpoint") - curl -s \ - "https://api.github.com/repos/${repo}/pulls/${pr_number}" \ - -H "Authorization: Bearer ${GITHUB_TOKEN}" \ - -H "Accept: application/vnd.github.v3.diff" -} - -# NOTE(jimmylee) -# List open PRs for a branch -# Usage: github_list_prs [head_branch] -github_list_prs() { - local repo="$1" - local head="${2:-}" - - local endpoint="/repos/${repo}/pulls?state=open" - if [[ -n "$head" ]]; then - endpoint="${endpoint}&head=${head}" - fi + # Output the body + cat "$body_file" - github_api GET "$endpoint" -} - -# NOTE(jimmylee) -# Test GitHub API connection -# Returns "PASS" on success, error message on failure -test_github_api() { - if [[ -z "${GITHUB_TOKEN:-}" ]]; then - echo "SKIP: GITHUB_TOKEN not set" + # Return success/failure based on HTTP code + if [[ "$http_code" =~ ^2[0-9][0-9]$ ]]; then + return 0 + else return 1 fi - - local user - user=$(github_get_user 2>&1) || { - echo "FAIL: $user" - return 1 - } - - echo "PASS" - return 0 } diff --git a/tools/apply-edits/Cargo.lock b/tools/apply-edits/Cargo.lock index e2a6011..2847d8e 100644 --- a/tools/apply-edits/Cargo.lock +++ b/tools/apply-edits/Cargo.lock @@ -1,129 +1,126 @@ # This file is automatically @generated by Cargo. # It is not intended for manual editing. 
-version = 4 +version = 3 + +[[package]] +name = "aho-corasick" +version = "1.1.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8e60d3430d3a69f1a9c85f1bb9aee4c387c5dcaff797e5b4d11dd1684b8a2654" +dependencies = ["memchr"] + +[[package]] +name = "apply-edits" +version = "0.2.0" +dependencies = [ + "clap", + "colored", + "fs2", + "memmap2", + "serde", + "serde_json", + "strsim", + "tempfile", + "thiserror", +] [[package]] name = "anstream" -version = "0.6.21" +version = "0.6.18" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "43d5b281e737544384e969a5ccad3f1cdd24b48086a0fc1b2a5262a26b8f4f4a" +checksum = "8acc5369981196006228e28809f761875c0327210a891e941f4c683b3a99529b" dependencies = [ - "anstyle", - "anstyle-parse", - "anstyle-query", - "anstyle-wincon", - "colorchoice", - "is_terminal_polyfill", - "utf8parse", + "anstyle", + "anstyle-parse", + "anstyle-query", + "anstyle-wincon", + "colorchoice", + "is_terminal_polyfill", + "utf8parse", ] [[package]] name = "anstyle" -version = "1.0.13" +version = "1.0.10" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "5192cca8006f1fd4f7237516f40fa183bb07f8fbdfedaa0036de5ea9b0b45e78" +checksum = "55cc3b69f167a1ef2e161439aa98aed94e6028e5f9a59be9a6ffb47aef1651f9" [[package]] name = "anstyle-parse" -version = "0.2.7" +version = "0.2.6" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4e7644824f0aa2c7b9384579234ef10eb7efb6a0deb83f9630a49594dd9c15c2" -dependencies = [ - "utf8parse", -] +checksum = "3b2c74f0965bc0bd3df614b0be68c12a0aa11a864d0c058c5a4a5e02474b03a0" +dependencies = ["utf8parse"] [[package]] name = "anstyle-query" -version = "1.1.5" +version = "1.1.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "40c48f72fd53cd289104fc64099abca73db4166ad86ea0b4341abe65af83dadc" -dependencies = [ - "windows-sys 0.61.2", -] +checksum = 
"79b4475e3157f27ad6c8fc24e9fddc0bda7fc5fc38baa7c54d8e5a57c7ea68cf" +dependencies = ["windows-sys"] [[package]] name = "anstyle-wincon" -version = "3.0.11" +version = "3.0.7" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "291e6a250ff86cd4a820112fb8898808a366d8f9f58ce16d1f538353ad55747d" -dependencies = [ - "anstyle", - "once_cell_polyfill", - "windows-sys 0.61.2", -] - -[[package]] -name = "apply-edits" -version = "0.1.0" +checksum = "ca3534e77181f9cc3e4128f6bef71cd9bbcc4461e78de0c11a0ae609adc17852" dependencies = [ - "clap", - "colored", - "fs2", - "memmap2", - "serde", - "serde_json", - "strsim", - "tempfile", - "thiserror", + "anstyle", + "once_cell", + "windows-sys", ] -[[package]] -name = "bitflags" -version = "2.10.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "812e12b5285cc515a9c72a5c1d3b6d46a19dac5acfef5265968c166106e31dd3" - [[package]] name = "cfg-if" -version = "1.0.4" +version = "1.0.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801" +checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd" [[package]] name = "clap" -version = "4.5.54" +version = "4.5.37" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c6e6ff9dcd79cff5cd969a17a545d79e84ab086e444102a591e288a8aa3ce394" +checksum = "eccb054f56cbd38340b799d5d157aa8c7f698f1c2a4e92783d9ed9754fe7c9f9" dependencies = [ - "clap_builder", - "clap_derive", + "clap_builder", + "clap_derive", ] [[package]] name = "clap_builder" -version = "4.5.54" +version = "4.5.37" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "fa42cf4d2b7a41bc8f663a7cab4031ebafa1bf3875705bfaf8466dc60ab52c00" +checksum = "efd9466fac8543255d3b1fcad4762c5e116ffe808c8a3043d4263cd4fd4862a2" dependencies = [ - "anstream", - "anstyle", - "clap_lex", - "strsim", + "anstream", + "anstyle", + "clap_lex", + "strsim", ] 
[[package]] name = "clap_derive" -version = "4.5.49" +version = "4.5.32" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "2a0b5487afeab2deb2ff4e03a807ad1a03ac532ff5a2cee5d86884440c7f7671" +checksum = "09176aae279615badda0765c0c0b3f6ed53f4709118af73cf4655d85d1530cd7" dependencies = [ - "heck", - "proc-macro2", - "quote", - "syn", + "heck", + "proc-macro2", + "quote", + "syn", ] [[package]] name = "clap_lex" -version = "0.7.7" +version = "0.7.4" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c3e64b0cc0439b12df2fa678eae89a1c56a529fd067a9115f7827f1fffd22b32" +checksum = "f46ad14479a25f3990e4fcfcab6e54ac5795a4269e06978e5734a3c893d6cfd3" [[package]] name = "colorchoice" -version = "1.0.4" +version = "1.0.3" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "b05b61dc5112cbb17e4b6cd61790d9845d13888356391624cbe7e41efeac1e75" +checksum = "5b63caa9aa9397e2d9480a9b13673856c78d8ac123288526c37d7839f2a86990" [[package]] name = "colored" @@ -131,25 +128,15 @@ version = "2.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "117725a109d387c937a1533ce01b450cbde6b88abceea8473c4d7a85853cda3c" dependencies = [ - "lazy_static", - "windows-sys 0.59.0", -] - -[[package]] -name = "errno" -version = "0.3.14" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "39cab71617ae0d63f51a36d69f866391735b51691dbda63cf6f96d042b63efeb" -dependencies = [ - "libc", - "windows-sys 0.61.2", + "lazy_static", + "windows-sys", ] [[package]] name = "fastrand" version = "2.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be" +checksum = "37909eebbb50d72f9059c3b6d82c0463f2f55a3edae3d9a663d87335b8b2ce2b" [[package]] name = "fs2" @@ -157,20 +144,20 @@ version = "0.4.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = 
"9564fc758e15025b46aa6643b1b77d047d1a56a1aea6e01002ac0c7026876213" dependencies = [ - "libc", - "winapi", + "libc", + "winapi", ] [[package]] name = "getrandom" -version = "0.3.4" +version = "0.3.3" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "899def5c37c4fd7b2664648c28120ecec138e4d395b459e5ca34f9cce2dd77fd" +checksum = "26145e563e54f2cadc477553f1ec5ee650b00862f0a58bcd12cbdc5f0ea2d2f4" dependencies = [ - "cfg-if", - "libc", - "r-efi", - "wasip2", + "cfg-if", + "libc", + "wasi", + "windows-targets", ] [[package]] @@ -181,15 +168,15 @@ checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea" [[package]] name = "is_terminal_polyfill" -version = "1.70.2" +version = "1.70.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a6cb138bb79a146c1bd460005623e142ef0181e3d0219cb493e02f7d08a35695" +checksum = "7943c866cc5cd64cbc25b2e01621d07fa8eb2a1a23160ee81ce38704e97b8ecf" [[package]] name = "itoa" -version = "1.0.17" +version = "1.0.15" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "92ecc6618181def0457392ccd0ee51198e065e016d1d527a7ac1b6dc7c1f09d2" +checksum = "4a5f13b858c8d314ee3e8f639011f7ccefe71f97f96e50151fb991f267928e2c" [[package]] name = "lazy_static" @@ -199,30 +186,22 @@ checksum = "bbd2bcb4c963f2ddae06a2efc7e9f3591312473c50c6685e1f298068316e66fe" [[package]] name = "libc" -version = "0.2.180" +version = "0.2.172" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "bcc35a38544a891a5f7c865aca548a982ccb3b8650a5b06d0fd33a10283c56fc" - -[[package]] -name = "linux-raw-sys" -version = "0.11.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "df1d3c3b53da64cf5760482273a98e575c651a67eec7f77df96b5b642de8f039" +checksum = "d750af042f7ef4f724306de029d18836c26c1765a54a6a3f094cbd23a7267ffa" [[package]] name = "memchr" -version = "2.7.6" +version = "2.7.4" source = 
"registry+https://github.com/rust-lang/crates.io-index" -checksum = "f52b00d39961fc5b2736ea853c9cc86238e165017a493d1d5c8eac6bdc4cc273" +checksum = "78ca9ab1a0bbd2f5d736e6e7e185bcb27a9a03d2eedb3e1a05fcaa5ae4e3bf89" [[package]] name = "memmap2" -version = "0.9.9" +version = "0.9.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "744133e4a0e0a658e1374cf3bf8e415c4052a15a111acd372764c55b4177d490" -dependencies = [ - "libc", -] +checksum = "fd3f7eb4f8b89fc69f1d217e8fbf0f5e33058319e049d6d8c9b0f24d9ed49dee" +dependencies = ["libc"] [[package]] name = "once_cell" @@ -231,89 +210,84 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "42f5e15c9953c5e4ccceeb2e7382a716482c34515315f7b03532b8b4e8393d2d" [[package]] -name = "once_cell_polyfill" -version = "1.70.2" +name = "proc-macro2" +version = "1.0.95" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe" +checksum = "02b3e5e68a3a1a02aad3ec490a98007cbc13c37cbe84a3cd7b8e406d76e7f778" +dependencies = ["unicode-ident"] [[package]] -name = "proc-macro2" -version = "1.0.105" +name = "quote" +version = "1.0.40" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "535d180e0ecab6268a3e718bb9fd44db66bbbc256257165fc699dadf70d16fe7" -dependencies = [ - "unicode-ident", -] +checksum = "1885c039570dc00dcb4ff087a89e185fd56bae234ddc7f056a945bf36467248d" +dependencies = ["proc-macro2"] [[package]] -name = "quote" -version = "1.0.43" +name = "regex" +version = "1.11.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "dc74d9a594b72ae6656596548f56f667211f8a97b3d4c3d467150794690dc40a" +checksum = "b544ef1b4eac5dc2db33ea63606ae9ffcfac26c1416a2806ae0bf5f4f8dc5c31" dependencies = [ - "proc-macro2", + "aho-corasick", + "memchr", + "regex-automata", + "regex-syntax", ] [[package]] -name = "r-efi" -version = "5.3.0" +name = "regex-automata" +version = 
"0.4.9" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "69cdb34c158ceb288df11e18b4bd39de994f6657d83847bdffdbd7f346754b0f" +checksum = "809e8dc61f6de73b46c85f4c96486310fe304c434cfa43669d7b40f711150908" +dependencies = [ + "aho-corasick", + "memchr", + "regex-syntax", +] [[package]] -name = "rustix" -version = "1.1.3" +name = "regex-syntax" +version = "0.8.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "146c9e247ccc180c1f61615433868c99f3de3ae256a30a43b49f67c2d9171f34" -dependencies = [ - "bitflags", - "errno", - "libc", - "linux-raw-sys", - "windows-sys 0.61.2", -] +checksum = "2b15c43186be67a4fd63bee50d0303aababcffd3ac9e5947f8e8ea543a44e82a" [[package]] -name = "serde" -version = "1.0.228" +name = "ryu" +version = "1.0.20" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e" -dependencies = [ - "serde_core", - "serde_derive", -] +checksum = "28d3b2b1366ec20994f1fd18c3c594f05c5dd4bc44d8bb0c1c632c8d6829481f" [[package]] -name = "serde_core" -version = "1.0.228" +name = "serde" +version = "1.0.219" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad" +checksum = "5f0e2c6ed6606f2513cd4822c3c39f93b7a5545c7bfbe93a4cd4089a1a6c0dc7" dependencies = [ - "serde_derive", + "serde_derive", ] [[package]] name = "serde_derive" -version = "1.0.228" +version = "1.0.219" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79" +checksum = "5b0276cf7f2c73365f7157c8123c21cd9a50fbbd844757af28ca1f5925fc2a00" dependencies = [ - "proc-macro2", - "quote", - "syn", + "proc-macro2", + "quote", + "syn", ] [[package]] name = "serde_json" -version = "1.0.149" +version = "1.0.140" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = 
"83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" +checksum = "20068b6e96dc6c9bd23e01df8827e6c7e1f2fddd43c21810382803c136b99373" dependencies = [ - "itoa", - "memchr", - "serde", - "serde_core", - "zmij", + "itoa", + "memchr", + "ryu", + "serde", ] [[package]] @@ -324,26 +298,26 @@ checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f" [[package]] name = "syn" -version = "2.0.114" +version = "2.0.101" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d4d107df263a3013ef9b1879b0df87d706ff80f65a86ea879bd9c31f9b307c2a" +checksum = "8ce2b7fc941b3a24138a0a7cf8e858bfc6a992e7978a068a5c760deb0ed43caf" dependencies = [ - "proc-macro2", - "quote", - "unicode-ident", + "proc-macro2", + "quote", + "unicode-ident", ] [[package]] name = "tempfile" -version = "3.24.0" +version = "3.20.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "655da9c7eb6305c55742045d5a8d2037996d61d8de95806335c7c86ce0f82e9c" +checksum = "e8a64e3985349f2441a1a9ef0b853f869006c3855f2cda6862a94d26ebb9d6a1" dependencies = [ - "fastrand", - "getrandom", - "once_cell", - "rustix", - "windows-sys 0.61.2", + "fastrand", + "getrandom", + "once_cell", + "rustix", + "windows-sys", ] [[package]] @@ -352,7 +326,7 @@ version = "1.0.69" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b6aaf5339b578ea85b50e080feb250a3e8ae8cfcdff9a461c9ec2904bc923f52" dependencies = [ - "thiserror-impl", + "thiserror-impl", ] [[package]] @@ -361,16 +335,16 @@ version = "1.0.69" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "4fee6c4efc90059e10f81e6d42c60a18f76588c3d74cb83a0b242a2b6c7504c1" dependencies = [ - "proc-macro2", - "quote", - "syn", + "proc-macro2", + "quote", + "syn", ] [[package]] name = "unicode-ident" -version = "1.0.22" +version = "1.0.18" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = 
"9312f7c4f6ff9069b165498234ce8be658059c6728633667c526e27dc2cf1df5" +checksum = "5a5f39404a5da50712a4c1eecf25e90dd62b613502b7e925fd4e4d19b5c96512" [[package]] name = "utf8parse" @@ -379,22 +353,62 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821" [[package]] -name = "wasip2" -version = "1.0.2+wasi-0.2.9" +name = "rustix" +version = "1.0.7" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "9517f9239f02c069db75e65f174b3da828fe5f5b945c4dd26bd25d89c03ebcf5" +checksum = "c71e83d6afe7ff64890ec6b71d6a69bb8a610ab78ce364b3352876bb4c801266" dependencies = [ - "wit-bindgen", + "bitflags", + "errno", + "libc", + "linux-raw-sys", + "windows-sys", ] +[[package]] +name = "bitflags" +version = "2.9.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1b8e56985ec62d17e9c1001dc89c88ecd7dc08e47eba5ec7c29c7b5eeecde967" + +[[package]] +name = "errno" +version = "0.3.12" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cea14ef9355e3beab063703aa9dab15afd25f0667c341310c1e5274bb1d0da18" +dependencies = [ + "libc", + "windows-sys", +] + +[[package]] +name = "linux-raw-sys" +version = "0.9.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cd945864f07fe9f5371a27ad7b52a172b4b499999f1d97574c9fa68373937e12" + +[[package]] +name = "wasi" +version = "0.14.2+wasi-0.2.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9683f9a5a998d873c0d21fcbe3c083009670149a8fab228644b8bd36b2c48cb3" +dependencies = ["wit-bindgen-rt"] + +[[package]] +name = "wit-bindgen-rt" +version = "0.39.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6f42320e61fe2cfd34354ecb597f86f413484a798ba44a8ca1165c58d42da6c1" +dependencies = ["bitflags"] + [[package]] name = "winapi" version = "0.3.9" source = "registry+https://github.com/rust-lang/crates.io-index" 
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419" dependencies = [ - "winapi-i686-pc-windows-gnu", - "winapi-x86_64-pc-windows-gnu", + "winapi-i686-pc-windows-gnu", + "winapi-x86_64-pc-windows-gnu", ] [[package]] @@ -409,29 +423,12 @@ version = "0.4.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f" -[[package]] -name = "windows-link" -version = "0.2.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f0805222e57f7521d6a62e36fa9163bc891acd422f971defe97d64e70d0a4fe5" - [[package]] name = "windows-sys" version = "0.59.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "1e38bc4d79ed67fd075bcc251a1c39b32a1776bbe92e5bef1f0bf1f8c531853b" -dependencies = [ - "windows-targets", -] - -[[package]] -name = "windows-sys" -version = "0.61.2" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "ae137229bcbd6cdf0f7b80a31df61766145077ddf49416a728b02cb3921ff3fc" -dependencies = [ - "windows-link", -] +dependencies = ["windows-targets"] [[package]] name = "windows-targets" @@ -439,27 +436,27 @@ version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "9b724f72796e036ab90c1021d4780d4d3d648aca59e491e6b98e725b84e99973" dependencies = [ - "windows_aarch64_gnullvm", - "windows_aarch64_msvc", - "windows_i686_gnu", - "windows_i686_gnullvm", - "windows_i686_msvc", - "windows_x86_64_gnu", - "windows_x86_64_gnullvm", - "windows_x86_64_msvc", + "windows_aarch64_gnullvm", + "windows_aarch64_msvc", + "windows_i686_gnu", + "windows_i686_gnullvm", + "windows_i686_msvc", + "windows_x86_64_gnu", + "windows_x86_64_gnullvm", + "windows_x86_64_msvc", ] [[package]] name = "windows_aarch64_gnullvm" version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "32a4622180e7a0ec044bb555404c800bc9fd9ec262ec147edd5989ccd0c02cd3" 
+checksum = "32a4622180e7a0ec044bb555404c800bc9fd9ec262ec147edd5989ccd0c02cd3" [[package]] name = "windows_aarch64_msvc" version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "09ec2a7bb152e2252b53fa7803150007879548bc709c039df7627cabbd05d469" +checksum = "09ec2a7bb152e2252b53fa7803150007879548bc709c039df7627cab261d06d5" [[package]] name = "windows_i686_gnu" @@ -489,22 +486,10 @@ checksum = "147a5c80aabfbf0c7d901cb5895d1de30ef2907eb21fbbab29ca94c5b08b1a78" name = "windows_x86_64_gnullvm" version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "24d5b23dc417412679681396f2b49f3de8c1473deb516bd34410872eff51ed0d" +checksum = "24d5b23dc417412679681396f2b49f3de8c1473deb516bd34410872eff51ed0d" [[package]] name = "windows_x86_64_msvc" version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "589f6da84c646204747d1270a2a5661ea66ed1cced2631d546fdfb155959f9ec" - -[[package]] -name = "wit-bindgen" -version = "0.51.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d7249219f66ced02969388cf2bb044a09756a083d0fab1e566056b04d9fbcaa5" - -[[package]] -name = "zmij" -version = "1.0.15" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "94f63c051f4fe3c1509da62131a678643c5b6fbdc9273b2b79d4378ebda003d2" diff --git a/tools/apply-edits/Cargo.toml b/tools/apply-edits/Cargo.toml index 8e97e64..1fc97da 100644 --- a/tools/apply-edits/Cargo.toml +++ b/tools/apply-edits/Cargo.toml @@ -1,46 +1,36 @@ +# NOTE(jimmylee) +# Cargo configuration for the apply-edits tool. +# This tool provides safe, atomic file editing operations for the Engineer persona.
+ [package] name = "apply-edits" -version = "0.1.0" +version = "0.2.0" edition = "2021" -description = "Apply targeted code edits with multi-line support for www-agent" -authors = ["www-agent"] +rust-version = "1.70" +description = "Transactional code edit engine with atomic rollback support" +license = "MIT" [dependencies] -# NOTE(jimmylee) -# serde/serde_json: JSON serialization for input/output +clap = { version = "4.4", features = ["derive"] } serde = { version = "1.0", features = ["derive"] } serde_json = "1.0" - -# NOTE(jimmylee) -# clap: Command-line argument parsing with derive macros -clap = { version = "4.4", features = ["derive"] } - -# NOTE(jimmylee) -# strsim: String similarity algorithms (Levenshtein, etc.) for fuzzy matching -strsim = "0.11" - -# NOTE(jimmylee) -# colored: Terminal colors for human-readable output colored = "2.1" - -# NOTE(jimmylee) -# thiserror: Derive macros for error types thiserror = "1.0" - -# NOTE(angeldev) -# memmap2: Memory-mapped file I/O for large files +strsim = "0.11" memmap2 = "0.9" - -# NOTE(angeldev) -# fs2: Cross-platform file locking for safe concurrent access fs2 = "0.4" +[dev-dependencies] +tempfile = "3.10" + +[[bin]] +name = "apply-edits" +path = "src/main.rs" + +[lib] +name = "apply_edits" +path = "src/lib.rs" + [profile.release] -# NOTE(jimmylee) -# Optimize for speed and small binary size -opt-level = 3 lto = true -strip = true - -[dev-dependencies] -tempfile = "3.24.0" +opt-level = 3 diff --git a/tools/apply-edits/tests/integration_tests.rs b/tools/apply-edits/tests/integration_tests.rs index d0c0e3f..f1ccbdf 100644 --- a/tools/apply-edits/tests/integration_tests.rs +++ b/tools/apply-edits/tests/integration_tests.rs @@ -68,7 +68,7 @@ fn test_dry_run_no_filesystem_changes() { } #[test] -fn test_atomic_mode_rollback_on_failure() { +fn test_atomic_mode_rolls_back_on_failure() { let dir = tempdir().unwrap(); // Create first file that will be modified successfully @@ -297,3 +297,121 @@ fn 
test_dry_run_validates_all_edits() { let json: serde_json::Value = serde_json::from_str(&stdout).unwrap(); assert_eq!(json["failed"], 1, "should report 1 failed edit"); } + +#[test] +fn test_large_file_dry_run_no_modification() { + let dir = tempdir().unwrap(); + let test_file = dir.path().join("large_file.txt"); + + // Create a file larger than 100KB (the LARGE_FILE_THRESHOLD) + let line = "This is a line of text that will be repeated many times to create a large file.\n"; + let large_content: String = line.repeat(2000); // ~160KB + assert!(large_content.len() > 100 * 1024, "Test file must be >100KB"); + + fs::write(&test_file, &large_content).unwrap(); + + // Get original metadata for comparison + let original_metadata = fs::metadata(&test_file).unwrap(); + let original_modified = original_metadata.modified().unwrap(); + + // Small delay to ensure filesystem timestamp would change if modified + std::thread::sleep(std::time::Duration::from_millis(50)); + + let json_input = r#"{ + "edits": [ + { + "type": "replace", + "path": "large_file.txt", + "search": "This is a line of text", + "replace": "This is MODIFIED text" + } + ] + }"#; + + let output = run_apply_edits(dir.path(), json_input, &["--dry-run"]); + + // Should succeed in dry-run + assert!(output.status.success(), "dry-run on large file should succeed"); + + // CRITICAL: File content must be unchanged + let after_content = fs::read_to_string(&test_file).unwrap(); + assert_eq!(after_content, large_content, "dry-run must not modify large file contents"); + + // CRITICAL: File metadata (timestamp) should be unchanged + let after_metadata = fs::metadata(&test_file).unwrap(); + let after_modified = after_metadata.modified().unwrap(); + assert_eq!(original_modified, after_modified, "dry-run must not change file modification time"); + + // Verify JSON output indicates dry-run success + let stdout = String::from_utf8_lossy(&output.stdout); + let json: serde_json::Value = serde_json::from_str(&stdout).unwrap(); + 
assert!(json["success"].as_bool().unwrap(), "JSON should indicate success"); +} + +#[test] +fn test_atomic_rollback_multiple_files() { + let dir = tempdir().unwrap(); + + // Create three files that will be modified + let file1 = dir.path().join("file1.txt"); + let file2 = dir.path().join("file2.txt"); + let file3 = dir.path().join("file3.txt"); + + fs::write(&file1, "file1 original content\n").unwrap(); + fs::write(&file2, "file2 original content\n").unwrap(); + fs::write(&file3, "file3 original content\n").unwrap(); + + // First two edits will succeed, third will fail (search not found) + // This tests that ALL prior successful edits are rolled back + let json_input = r#"{ + "edits": [ + { + "type": "replace", + "path": "file1.txt", + "search": "file1 original content", + "replace": "file1 MODIFIED content" + }, + { + "type": "replace", + "path": "file2.txt", + "search": "file2 original content", + "replace": "file2 MODIFIED content" + }, + { + "type": "replace", + "path": "file3.txt", + "search": "this string does not exist and will cause failure", + "replace": "replacement" + } + ] + }"#; + + // Run in atomic mode (default, no --partial flag) + let output = run_apply_edits(dir.path(), json_input, &[]); + + // Command should fail + assert!(!output.status.success(), "atomic mode should fail when any edit fails"); + + // CRITICAL: ALL files should be rolled back to original state + let file1_content = fs::read_to_string(&file1).unwrap(); + let file2_content = fs::read_to_string(&file2).unwrap(); + let file3_content = fs::read_to_string(&file3).unwrap(); + + assert_eq!(file1_content, "file1 original content\n", + "file1 must be rolled back in atomic mode"); + assert_eq!(file2_content, "file2 original content\n", + "file2 must be rolled back in atomic mode"); + assert_eq!(file3_content, "file3 original content\n", + "file3 should be unchanged (edit failed before modification)"); + + // Verify stderr mentions rollback + let stderr = 
String::from_utf8_lossy(&output.stderr); + assert!(stderr.contains("rollback") || stderr.contains("Rollback") || stderr.contains("rolled back") || stderr.contains("Rolling back"), + "stderr should mention rollback: {}", stderr); + + // Verify JSON output shows correct counts + let stdout = String::from_utf8_lossy(&output.stdout); + let json: serde_json::Value = serde_json::from_str(&stdout).unwrap(); + assert!(!json["success"].as_bool().unwrap(), "success should be false"); + assert_eq!(json["failed"], 1, "should report 1 failed edit"); +}