HartBrook · HartBrook · Jan 29, 2026 · Jan 29, 2026 · Jan 29, 2026
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -11,7 +11,7 @@
       "name": "look",
       "source": "./src",
       "description": "Sequential code review with fresh agent contexts. Runs multiple independent review passes to catch more issues.",
-      "version": "0.2.1",
+      "version": "0.3.0",
       "author": { "name": "HartBrook" },
       "repository": "https://github.com/HartBrook/lookagain",
       "license": "MIT",

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,18 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.3.0] - 2026-01-28
+
+### Changed
+
+- **Breaking**: Replaced unsupported `arguments:` frontmatter array and `$ARGUMENTS.<name>` dot-access syntax with agent-side parsing of `$ARGUMENTS`. Claude Code only supports `$ARGUMENTS` (whole string) and `$ARGUMENTS[N]` (positional) — the named dot-access syntax was never interpolated, causing the agent to see literal placeholder text and fall back to safe defaults.
+- Commands now include a "Parse arguments" section with a defaults table. The agent parses `key=value` pairs from the raw `$ARGUMENTS` string and applies documented defaults for missing keys.
+- Commands log resolved configuration to make argument values visible and debuggable.
+- Frontmatter uses `argument-hint` (supported) instead of `arguments:` (unsupported).
+- Updated static tests to enforce the new pattern and reject the old `arguments:` frontmatter.
+- Updated behavioral evals with new test cases for empty arguments (defaults) and partial arguments.
+- Updated CONTRIBUTING.md prompt authoring guidance for the correct argument pattern.
+
 ## [0.2.1] - 2026-01-28
 
 ### Fixed

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -67,7 +67,7 @@ make test
 make eval
 ```
 
-`make test` runs fast, offline checks that validate plugin structure: file existence, JSON validity, frontmatter fields, cross-references between manifests, and that all frontmatter arguments are referenced as `$ARGUMENTS.<name>` in the instruction body (not just in display sections).
+`make test` runs fast, offline checks that validate plugin structure: file existence, JSON validity, frontmatter fields, cross-references between manifests, and that commands accepting arguments use the correct pattern (`argument-hint` in frontmatter, `$ARGUMENTS` placeholder, and a defaults table in the body).
 
 `make eval` runs [promptfoo](https://promptfoo.dev) evals that send the interpolated prompts to Claude and assert on behavioral correctness. For example, it verifies that `auto-fix=false` causes the model to skip fixes, and that `passes=5` results in 5 planned passes.
 
@@ -120,9 +120,11 @@ You can also test the plugin through the marketplace install flow, which is clos
 
 When editing or adding command prompts in `src/commands/`:
 
-- Define arguments in the YAML frontmatter with `name`, `description`, and `default`.
-- Reference arguments in the instruction body using `$ARGUMENTS.<name>` — not just in display sections. The executing agent needs to see the interpolated value at the point where it makes decisions. For example, write `If $ARGUMENTS.auto-fix is true` rather than `If auto-fix is enabled`.
-- `make test` enforces that every frontmatter argument appears as `$ARGUMENTS.<name>` somewhere in the body. If you add an argument, the test will fail until you reference it.
+- **Use `$ARGUMENTS` for the raw string.** Claude Code replaces `$ARGUMENTS` with whatever the user typed after the command name. There is no `$ARGUMENTS.name` dot-access syntax — only `$ARGUMENTS` (whole string) and `$ARGUMENTS[N]` (positional). Do NOT use an `arguments:` array in frontmatter — it is not a supported Claude Code feature and will not be interpolated.
+- **Add `argument-hint`** in frontmatter to document expected input format (e.g., `argument-hint: "[key=value ...]"`).
+- **Include a defaults table** in the body listing each key, its default, and a description. Instruct the agent to parse `key=value` pairs from `$ARGUMENTS` and fall back to defaults for missing keys.
+- **Log the resolved configuration** so it is visible in output and reviewable in evals.
+- `make test` enforces that commands using `$ARGUMENTS` have `argument-hint` in frontmatter, a defaults table in the body, and do NOT use the unsupported `arguments:` frontmatter array.
 - After changing prompt logic, run `make eval` to verify models still interpret the arguments correctly.
 
 ## Pull Requests

diff --git a/evals/prompt-loader.js b/evals/prompt-loader.js
@@ -1,6 +1,7 @@
-// Loads a markdown command file, strips frontmatter, interpolates
-// $ARGUMENTS.* tokens with test-case variables, and prepends a
-// meta-instruction so the model describes its plan without executing.
+// Loads a markdown command file, strips frontmatter, replaces the
+// $ARGUMENTS placeholder with a key=value string built from test-case
+// variables, and prepends a meta-instruction so the model describes
+// its plan without executing.
 
 const fs = require("fs");
 const path = require("path");
@@ -18,18 +19,17 @@ function generatePrompt(context) {
   // Strip YAML frontmatter (between opening and closing ---)
   const stripped = raw.replace(/^---\n[\s\S]*?\n---\n/, "");
 
-  // Replace $ARGUMENTS.<name> with matching arg_<name> variable.
-  // Argument names may contain hyphens (e.g. auto-fix, max-passes).
-  const interpolated = stripped.replace(
-    /\$ARGUMENTS\.([\w-]+)/g,
-    (_match, name) => {
-      const key = `arg_${name}`;
-      if (key in vars) {
-        return vars[key];
-      }
-      return _match; // leave unresolved tokens as-is
-    },
-  );
+  // Build a key=value argument string from all vars prefixed with arg_.
+  // e.g. { arg_passes: "5", arg_auto-fix: "false" } → "passes=5 auto-fix=false"
+  const argPairs = Object.entries(vars)
+    .filter(([k]) => k.startsWith("arg_"))
+    .map(([k, v]) => `${k.slice(4)}=${v}`)
+    .join(" ");
+
+  // Replace the $ARGUMENTS placeholder with the built argument string.
+  // This mirrors what Claude Code does at runtime: $ARGUMENTS is replaced
+  // with the raw text the user typed after the command name.
+  const interpolated = stripped.replace(/\$ARGUMENTS/g, argPairs);
 
   const meta = [
     "You are analyzing a Claude Code plugin command prompt.",

diff --git a/evals/promptfooconfig.yaml b/evals/promptfooconfig.yaml
@@ -1,4 +1,4 @@
-description: "Behavioral evals for lookagain prompt interpolation"
+description: "Behavioral evals for lookagain argument handling"
 
 prompts:
   - file://prompt-loader.js
@@ -9,6 +9,22 @@ providers:
       max_tokens: 2048
 
 tests:
+  # ==================================================================
+  # again.md — no arguments (defaults)
+  # ==================================================================
+  - description: "no arguments → model uses all defaults (auto-fix=true, passes=3, thorough)"
+    vars:
+      prompt_file: src/commands/again.md
+    assert:
+      - type: llm-rubric
+        value: >
+          The response must use default values for all settings since the
+          argument string is empty: 3 passes, staged target, auto-fix true,
+          thorough model, and max-passes 7. It must describe applying fixes
+          for must_fix issues (since auto-fix defaults to true). It should
+          NOT say arguments are missing or unavailable — an empty string
+          simply means the user wants all defaults.
+
   # ==================================================================
   # again.md — auto-fix interpretation
   # ==================================================================
@@ -114,6 +130,21 @@ tests:
           reviewing all changes on the current branch versus the base
           branch. It should reference branch comparison or merge-base.
 
+  # ==================================================================
+  # again.md — partial arguments (only auto-fix specified)
+  # ==================================================================
+  - description: "only auto-fix=false → defaults for everything else, no fixing"
+    vars:
+      prompt_file: src/commands/again.md
+      arg_auto-fix: "false"
+    assert:
+      - type: llm-rubric
+        value: >
+          The response must parse auto-fix=false from the argument string
+          and use defaults for all other settings (3 passes, staged target,
+          thorough model, max-passes 7). It must NOT apply any fixes since
+          auto-fix is explicitly false.
+
   # ==================================================================
   # tidy.md — all flag
   # ==================================================================
@@ -139,3 +170,19 @@ tests:
           The response must describe calculating a cutoff date by subtracting
           3 days from today, and only removing runs older than that cutoff.
           It should keep runs from the last 3 days.
+
+  # ==================================================================
+  # tidy.md — no arguments (defaults)
+  # ==================================================================
+  - description: "tidy with no arguments → keep=1, all=false"
+    vars:
+      prompt_file: src/commands/tidy.md
+    assert:
+      - type: llm-rubric
+        value: >
+          The response must use default values since the argument string is
+          empty: keep=1 (keep runs from the last 1 day) and all=false (do
+          NOT remove all runs). It should describe date-based filtering
+          with a 1-day retention window. An empty argument string means
+          the user wants all defaults — it should NOT say arguments are
+          missing or unavailable.
diff --git a/scripts/test.sh b/scripts/test.sh
@@ -129,48 +129,51 @@ test_frontmatter() {
     check_frontmatter "$PROJECT_ROOT/src/skills/lookagain-output-format/SKILL.md" name description
 }
 
-test_argument_interpolation() {
-    # Verify that arguments defined in frontmatter are referenced using
-    # $ARGUMENTS.<name> syntax in the instruction body, not just in the
-    # Configuration display section. This prevents the executing agent
-    # from missing argument values and falling back to safe defaults.
+test_argument_handling() {
+    # Verify that command files using arguments follow the correct pattern:
+    # 1. Frontmatter has argument-hint (not the unsupported arguments: array)
+    # 2. Body contains $ARGUMENTS placeholder for the raw argument string
+    # 3. Body contains a defaults table with Key/Default columns
+    # This ensures the agent receives and parses arguments at runtime
+    # rather than relying on non-existent compile-time interpolation.
 
     for file in "$PROJECT_ROOT"/src/commands/*.md; do
         local relpath="${file#"$PROJECT_ROOT"/}"
 
-        # Extract argument names from frontmatter
-        local args
-        args=$(awk '
-            NR==1 && /^---$/ { in_fm=1; next }
-            in_fm && /^---$/ { exit }
-            in_fm && /^  - name: / { gsub(/^  - name: /, ""); print }
-        ' "$file")
-
-        if [[ -z "$args" ]]; then
-            continue
-        fi
+        local frontmatter
+        frontmatter=$(awk 'NR==1{next} /^---$/{exit} {print}' "$file")
 
-        # Extract the body (everything after the second ---)
         local body
         body=$(awk '
             NR==1 && /^---$/ { in_fm=1; next }
             in_fm && /^---$/ { in_fm=0; next }
             !in_fm { print }
         ' "$file")
 
-        # For each argument, verify $ARGUMENTS.<name> appears in the body
-        local all_found=1
-        while IFS= read -r arg; do
-            local ref="\$ARGUMENTS.${arg}"
-            if ! echo "$body" | grep -qF "$ref"; then
-                fail "$relpath: argument '$arg' defined but \$ARGUMENTS.$arg never used in body"
-                all_found=0
-            fi
-        done <<< "$args"
-
-        if [[ $all_found -eq 1 ]]; then
-            pass "$relpath: all arguments interpolated in body"
+        # Skip commands that don't accept arguments
+        if ! echo "$body" | grep -qF '$ARGUMENTS'; then
+            continue
+        fi
+
+        # Must NOT use the unsupported arguments: array in frontmatter
+        if echo "$frontmatter" | grep -q "^arguments:"; then
+            fail "$relpath: uses unsupported 'arguments:' frontmatter — use 'argument-hint:' and agent-side parsing instead"
+            continue
+        fi
+
+        # Must have argument-hint in frontmatter
+        if ! echo "$frontmatter" | grep -q "^argument-hint:"; then
+            fail "$relpath: missing 'argument-hint:' in frontmatter"
+            continue
         fi
+
+        # Must have a defaults table (Key | Default header pattern)
+        if ! echo "$body" | grep -q "| Key | Default"; then
+            fail "$relpath: missing defaults table (expected '| Key | Default |' header)"
+            continue
+        fi
+
+        pass "$relpath: argument handling correct"
     done
 }
 
@@ -369,8 +372,8 @@ echo "--- frontmatter ---"
 test_frontmatter
 echo ""
 
-echo "--- argument interpolation ---"
-test_argument_interpolation
+echo "--- argument handling ---"
+test_argument_handling
 echo ""
 
 echo "--- cross-references ---"

diff --git a/src/commands/again.md b/src/commands/again.md
@@ -1,45 +1,53 @@
 ---
 name: again
 description: Run sequential code review passes with fresh contexts to catch more issues
-arguments:
-  - name: passes
-    description: Number of review passes to run
-    default: "3"
-  - name: target
-    description: "What to review: staged, commit, branch, or a file/directory path"
-    default: "staged"
-  - name: auto-fix
-    description: Automatically fix must_fix issues between passes (true/false)
-    default: "true"
-  - name: model
-    description: "Reviewer model: fast (haiku), balanced (sonnet), thorough (inherit)"
-    default: "thorough"
-  - name: max-passes
-    description: Maximum passes if must_fix issues persist
-    default: "7"
+argument-hint: "[key=value ...]"
 ---
 
 # Iterative Code Review
 
 You orchestrate sequential, multi-pass code review. Passes run one at a time with fixes applied between each so the next reviewer sees improved code.
 
-## Configuration
+## Phase 0: Setup
 
-- **Passes**: $ARGUMENTS.passes
-- **Target**: $ARGUMENTS.target
-- **Auto-fix**: $ARGUMENTS.auto-fix
-- **Model**: $ARGUMENTS.model
-- **Max passes**: $ARGUMENTS.max-passes
+### Parse arguments
 
-## Phase 0: Setup
+The user may pass key=value pairs after the command name. The raw argument string is:
+
+> $ARGUMENTS
+
+If the argument string is empty or blank, the user provided no overrides — use all defaults. Parse `key=value` pairs from the string. For any key not provided, use the default.
+
+| Key | Default | Description |
+|---|---|---|
+| `passes` | `3` | Number of review passes to run |
+| `target` | `staged` | What to review: `staged`, `commit`, `branch`, or a file/directory path |
+| `auto-fix` | `true` | Automatically fix must_fix issues between passes (`true` or `false`) |
+| `model` | `thorough` | Reviewer model: `fast` (haiku), `balanced` (sonnet), `thorough` (inherit) |
+| `max-passes` | `7` | Maximum passes if must_fix issues persist |
+
+**Log the resolved configuration** so it is visible in the output:
+
+```
+Configuration:
+  passes:     <resolved value>
+  target:     <resolved value>
+  auto-fix:   <resolved value>
+  model:      <resolved value>
+  max-passes: <resolved value>
+```
+
+### Generate run ID
+
+Generate a run ID: `YYYY-MM-DDTHH-MM-SS`. All output goes under `.lookagain/<run-id>/`.
+
+### Orchestrator role
 
-1. Generate a run ID: `YYYY-MM-DDTHH-MM-SS`. All output goes under `.lookagain/<run-id>/`.
-2. Do NOT read the codebase yourself. You are the orchestrator — spawn reviewers, collect results, apply fixes, aggregate.
-3. Create a TodoWrite list with one item per pass plus aggregation. Mark items `in_progress` when starting and `completed` when done.
+Do NOT read the codebase yourself. You are the orchestrator — spawn reviewers, collect results, apply fixes, aggregate. Create a TodoWrite list with one item per pass plus aggregation. Mark items `in_progress` when starting and `completed` when done.
 
 ### Resolve scope
 
-Determine the scope instruction to pass to each reviewer:
+Using the resolved `target` value, determine the scope instruction to pass to each reviewer:
 
 | Target value | Scope instruction for reviewer |
 |---|---|
@@ -50,7 +58,7 @@ Determine the scope instruction to pass to each reviewer:
 
 ### Resolve model
 
-Map the model argument to the Task tool model parameter:
+Using the resolved `model` value, map to the Task tool model parameter:
 
 | Model value | Task model |
 |---|---|
@@ -62,17 +70,17 @@ Map the model argument to the Task tool model parameter:
 
 CRITICAL: Passes run in sequence, NOT in parallel. Each pass reviews code after previous fixes.
 
-For each pass (1 through $ARGUMENTS.passes):
+For each pass (1 through the resolved `passes` value):
 
 **Review**: Spawn a fresh subagent via the Task tool using the `lookagain-reviewer` agent. Include: pass number, scope instruction, and instruction to use the `lookagain-output-format` skill. Set the model parameter based on the resolved model. Do NOT include findings from previous passes.
 
 **Collect**: Parse the JSON response. Store findings and track which pass found each issue.
 
-**Fix**: If `$ARGUMENTS.auto-fix` is `true`, apply fixes for `must_fix` issues only. Minimal changes, no refactoring.
+**Fix**: If the resolved `auto-fix` value is `true`, apply fixes for `must_fix` issues only. Minimal changes, no refactoring. If `auto-fix` is `false`, skip this step entirely — do not apply any fixes.
 
 **Log**: "Pass N complete. Found X must_fix, Y should_fix, Z suggestions."
 
-After completing $ARGUMENTS.passes passes, if `must_fix` issues remain and total passes < $ARGUMENTS.max-passes, run additional passes.
+After completing the configured number of passes, if `must_fix` issues remain and total passes < the resolved `max-passes` value, run additional passes.
 
 ## Phase 2: Aggregate
 
@@ -122,6 +130,6 @@ Include the count of previous runs (glob `.lookagain/????-??-??T??-??-??/`, subt
 1. **Sequential**: Never launch passes in parallel. Each must complete before the next starts.
 2. **Fresh context**: Always use the Task tool for subagents.
 3. **Independence**: Never tell subagents what previous passes found.
-4. **Minimal fixes**: Only change what's necessary when `$ARGUMENTS.auto-fix` is `true`.
+4. **Minimal fixes**: Only apply fixes when the resolved `auto-fix` value is `true`.
 5. **Valid JSON**: If subagent output fails to parse, log the error and continue.
-6. **Respect max-passes**: Never exceed $ARGUMENTS.max-passes.
+6. **Respect max-passes**: Never exceed the resolved `max-passes` value.