Workers can validate different checkouts and silently disagree on build/test results #171

## Summary
Workers can silently validate different filesystem states because review and test execution is not strictly pinned to the same checkout or commit, and testers can run decisive pass/fail checks from a dirty shared workspace.

## Incident
First Light issue `#221` / PR `#222` exposed this clearly:
- reviewer approved based on the PR diff and described the fixed code
- tester ran the compile repro in `/home/sai/.openclaw/workspace/firstlight`
- tester reproduced the old compile error and reported that the fix was not present locally
- later verification showed:
  - `origin/main` at merged commit `24164242926923d9accc44cfa65892c1506e77d8` **does** contain the fix
  - the shared workspace was on a different dirty local `HEAD` and still had the old broken lines

This created a false contradiction: the developer and reviewer said the compile should pass while the tester said it fails, when in fact they were validating different trees.

## Evidence
Merged PR:
- `yaqub0r/firstlight#222`
- merge commit: `24164242926923d9accc44cfa65892c1506e77d8`

Fixed code on `origin/main`:
```csharp
hasHarnessSeed = harnessSeed.HasValue,
harnessSeed = harnessSeed ?? 0,
```

Broken code still present in the shared local workspace during test:
```csharp
hasHarnessSeed = hasHarnessSeed,
harnessSeed = harnessSeed,
```

Observed local mismatch during follow-up:
- local workspace `HEAD`: `e1c54402`
- `origin/main`: `24164242`
- workspace had additional dirty tracked changes and untracked files

## Problem
Worker coordination currently appears to allow at least one of these failure modes:
1. tester uses a shared mutable workspace instead of an isolated clean worktree pinned to the target ref
2. reviewer/dev/tester are not all given the same mandatory target commit/checkout contract
3. decisive verification comments do not automatically include enough provenance (`git rev-parse HEAD`, dirty state, worktree path, target PR/commit)
4. workers do not fail fast when the workspace is dirty or not at the intended ref
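
As an illustration of item 3, here is a minimal sketch of how a worker could capture provenance and prepend it to a verdict comment. The names `Provenance`, `git`, `capture`, and `render` are hypothetical and not part of DevClaw; the sketch also assumes the target is supplied as a full 40-character SHA so that a plain string comparison is meaningful.

```python
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    worktree: str   # top-level directory of the checkout
    head: str       # full SHA from `git rev-parse HEAD`
    dirty: tuple    # lines from `git status --porcelain`; empty means clean
    target: str     # commit the worker was asked to validate (assumed full SHA)


def git(path, *args):
    """Run a git command inside `path` and return stdout without the trailing newline."""
    result = subprocess.run(
        ["git", "-C", path, *args],
        check=True, capture_output=True, text=True,
    )
    # Only strip trailing newlines: porcelain lines may start with a space.
    return result.stdout.rstrip("\n")


def capture(path, target):
    """Collect provenance fields from the checkout at `path`."""
    status = git(path, "status", "--porcelain")
    return Provenance(
        worktree=git(path, "rev-parse", "--show-toplevel"),
        head=git(path, "rev-parse", "HEAD"),
        dirty=tuple(status.splitlines()),
        target=target,
    )


def render(p):
    """Format provenance as a text block to attach to a verdict comment."""
    state = "clean" if not p.dirty else f"dirty ({len(p.dirty)} changed/untracked paths)"
    match = "yes" if p.head == p.target else "NO"
    return "\n".join([
        f"worktree: {p.worktree}",
        f"HEAD: {p.head}",
        f"target: {p.target}",
        f"HEAD matches target: {match}",
        f"state: {state}",
    ])
```

A worker would call something like `render(capture(workspace_path, target_sha))` before posting, so a reader of the comment can see at a glance which tree produced the result.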

## Expected behavior
For review/test/build tasks, DevClaw should ensure all workers validate the same code state.

Suggested contract:
- pin every worker to an explicit commit SHA / PR head SHA / merge commit
- prefer isolated clean worktrees for developer/tester/reviewer execution
- before any pass/fail verdict, record and/or enforce:
  - repo path
  - worktree path
  - branch/ref
  - `git rev-parse HEAD`
  - dirty/clean status
- tester should refuse a definitive verification run if the tree is dirty or if HEAD does not match the requested target commit
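
The contract above could be enforced with a small guard along these lines. This is a sketch under stated assumptions, not DevClaw's actual mechanism: `VerdictRefused`, `refusal_reasons`, `require_pinned_clean`, and `add_pinned_worktree` are hypothetical names, and the guard expects the caller to pass in the output of `git rev-parse HEAD` and the lines of `git status --porcelain`.

```python
import subprocess


class VerdictRefused(Exception):
    """Raised when a checkout is not fit for a definitive pass/fail run."""


def refusal_reasons(head, target, porcelain_lines):
    """Return the reasons a verdict must be refused; an empty list means OK."""
    reasons = []
    if head != target:
        reasons.append(f"HEAD {head[:8]} does not match target {target[:8]}")
    if porcelain_lines:
        reasons.append(
            f"tree is dirty: {len(porcelain_lines)} changed or untracked path(s)"
        )
    return reasons


def require_pinned_clean(head, target, porcelain_lines):
    """Fail fast instead of producing a verdict from the wrong tree."""
    reasons = refusal_reasons(head, target, porcelain_lines)
    if reasons:
        raise VerdictRefused("; ".join(reasons))


def add_pinned_worktree(repo, path, sha):
    """Create an isolated worktree at `path` with a detached HEAD at `sha`."""
    subprocess.run(
        ["git", "-C", repo, "worktree", "add", "--detach", path, sha],
        check=True,
    )
```

Applied to the incident, a local `HEAD` of `e1c54402` checked against target `24164242926923d9accc44cfa65892c1506e77d8` with a dirty tree would raise `VerdictRefused` rather than let the tester report a compile failure. Keeping `refusal_reasons` free of subprocess calls makes the policy easy to unit test without a repository.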

## Why this matters
Without this, DevClaw can post conflicting, authoritative-looking verdicts on the same issue even though the build result for a given commit is deterministic, which undermines trust in the workflow.

