Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 13 additions & 3 deletions internal/daemon/hooks_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1039,9 +1039,20 @@ func TestHandleEventNoLogWhenNoHooksMatch(t *testing.T) {
}

func TestWaitUntilIdle_ConcurrentEvents(t *testing.T) {
t.Parallel()
// A dedicated stress test proving WaitUntilIdle waits past the event-processing boundary
// under timing races and concurrent broadcasts.
if runtime.GOOS != "windows" {
t.Parallel()
}

iterations := 50
numEvents := 10
if runtime.GOOS == "windows" {
// Each hook starts a PowerShell process on Windows. Keep the test broad
// enough to exercise concurrent broadcasts without exhausting CI time.
iterations = 5
numEvents = 4
}

tmpDir := t.TempDir()

Expand All @@ -1053,11 +1064,10 @@ func TestWaitUntilIdle_ConcurrentEvents(t *testing.T) {
},
}

for i := range 50 {
for i := range iterations {
hr, broadcaster := setupRunner(t, cfg)

var wg sync.WaitGroup
numEvents := 10

for j := range numEvents {
wg.Add(1)
Expand Down
1 change: 1 addition & 0 deletions internal/prompt/templates/claude-code_review.md.gotmpl
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ Do NOT build the project, run the test suite, or execute the code while reviewin
- **Location**: File and line number where possible
- **Problem**: Concise description of the issue
- **Fix**: Brief suggested fix
- Separate multiple findings with `---` on its own line.
- After the findings, add `## Summary` with a single sentence summarizing the change.
- If you find no issues, start immediately with `No issues found.` and then add `Summary: <one sentence>`.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/templates/codex_review.md.gotmpl
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ Do NOT search or read files outside the repository checkout, except for an expli
- **Location**: File and line number where possible
- **Problem**: Concise description of the issue
- **Fix**: Brief suggested fix
- Separate multiple findings with `---` on its own line.
- After the findings, add `## Summary` with a single sentence summarizing the change.
- If you find no issues, start immediately with `No issues found.` and then add `Summary: <one sentence>`.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/templates/default_dirty.md.gotmpl
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/templates/default_range.md.gotmpl
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/templates/default_review.md.gotmpl
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/templates/default_security.md.gotmpl
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ For each finding, provide:
- File and line reference
- The specific code path an attacker would exploit and what they gain
- Suggested remediation
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file-level when the issue spans a range) and describe a plausible exploit path. The severity must match the exploitability you described. Drop any finding that fails these checks.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/templates/gemini_review.md.gotmpl
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ Do NOT build the project, run the test suite, or execute the code while reviewin
- **Location**: File and line number
- **Problem**: Concise description
- **Fix**: Brief suggested fix
- Separate multiple findings with `---` on its own line.
- If no issues are found, state "No issues found."

## Review Criteria
Expand Down
27 changes: 27 additions & 0 deletions internal/prompt/templates_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,33 @@ func TestGetSystemPrompt_Fallbacks(t *testing.T) {
}
}

func TestGetSystemPrompt_ReviewPromptsRequireFindingSeparators(t *testing.T) {
fixedTime := time.Date(2030, 6, 15, 0, 0, 0, 0, time.UTC)
mockNow := func() time.Time { return fixedTime }

tests := []struct {
name string
agent string
command string
}{
{name: "default review", agent: "test", command: "review"},
{name: "default dirty", agent: "test", command: "dirty"},
{name: "default range", agent: "test", command: "range"},
{name: "default security", agent: "test", command: "security"},
{name: "claude review", agent: "claude-code", command: "review"},
{name: "codex review", agent: "codex", command: "review"},
{name: "gemini review", agent: "gemini", command: "review"},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := getSystemPrompt(tt.agent, tt.command, mockNow)

assert.Contains(t, got, "Separate multiple findings with `---` on its own line.")
})
}
}

func TestGetSystemPrompt_DateInjection(t *testing.T) {
fixedTime := time.Date(2030, 6, 15, 0, 0, 0, 0, time.UTC)
fixedDateStr := "Current date: 2030-06-15 (UTC)"
Expand Down
1 change: 1 addition & 0 deletions internal/prompt/testdata/golden/dirty_review.golden
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
6 changes: 2 additions & 4 deletions internal/prompt/testdata/golden/dirty_truncated.golden
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down Expand Up @@ -112,9 +113,6 @@ index 0000000..1111111
+a line of content
+a line of content
+a line of content
+a line of content
+a line of content
+a line of content
+a line of conte
+a line of co
... (truncated)
```
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/testdata/golden/range_truncated.golden
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ Do NOT search or read files outside the repository checkout, except for an expli
- **Location**: File and line number where possible
- **Problem**: Concise description of the issue
- **Fix**: Brief suggested fix
- Separate multiple findings with `---` on its own line.
- After the findings, add `## Summary` with a single sentence summarizing the change.
- If you find no issues, start immediately with `No issues found.` and then add `Summary: <one sentence>`.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/testdata/golden/security_review.golden
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ For each finding, provide:
- File and line reference
- The specific code path an attacker would exploit and what they gain
- Suggested remediation
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file-level when the issue spans a range) and describe a plausible exploit path. The severity must match the exploitability you described. Drop any finding that fails these checks.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ Do NOT build the project, run the test suite, or execute the code while reviewin
- **Location**: File and line number where possible
- **Problem**: Concise description of the issue
- **Fix**: Brief suggested fix
- Separate multiple findings with `---` on its own line.
- After the findings, add `## Summary` with a single sentence summarizing the change.
- If you find no issues, start immediately with `No issues found.` and then add `Summary: <one sentence>`.

Expand Down
1 change: 1 addition & 0 deletions internal/prompt/testdata/golden/single_review_codex.golden
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ Do NOT search or read files outside the repository checkout, except for an expli
- **Location**: File and line number where possible
- **Problem**: Concise description of the issue
- **Fix**: Brief suggested fix
- Separate multiple findings with `---` on its own line.
- After the findings, add `## Summary` with a single sentence summarizing the change.
- If you find no issues, start immediately with `No issues found.` and then add `Summary: <one sentence>`.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ Do NOT build the project, run the test suite, or execute the code while reviewin
- **Location**: File and line number
- **Problem**: Concise description
- **Fix**: Brief suggested fix
- Separate multiple findings with `---` on its own line.
- If no issues are found, state "No issues found."

## Review Criteria
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ Do NOT search or read files outside the repository checkout, except for an expli
- **Location**: File and line number where possible
- **Problem**: Concise description of the issue
- **Fix**: Brief suggested fix
- Separate multiple findings with `---` on its own line.
- After the findings, add `## Summary` with a single sentence summarizing the change.
- If you find no issues, start immediately with `No issues found.` and then add `Summary: <one sentence>`.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ After reviewing, provide:
- File and line reference where possible
- What specifically goes wrong if this is not fixed (concrete harm, not "violates best practices")
- Suggested fix
- Separate multiple findings with `---` on its own line.

Before finalizing, verify your review: every finding must reference the narrowest applicable location (line number when possible, file or diff-level when the issue is an omission or spans a range), the severity must match the impact you described, and no two findings should contradict each other. Drop any finding that fails these checks.

Expand Down
Loading