
Reviewer Confidence Thresholds: Reduce Noise in Code Review Output #113

@dean0x

Description


Summary

Add confidence thresholds to Reviewer agents so that only findings with ≥80% confidence are reported. This reduces noise in /code-review output and improves the signal-to-noise ratio. Inspired by Harness Alpha's confidence-based filtering pattern.

Motivation

Harness Alpha's code-reviewer and all language-specific reviewers (Python, Go, Kotlin) require >80% confidence before flagging issues. Their philosophy: "Better to miss something than flood with false positives."

The Problem

Without confidence thresholds, reviewers report everything they notice — including:

  • Style preferences that aren't clearly wrong
  • Potential issues that depend on context the reviewer can't see
  • Edge cases that are handled elsewhere in the codebase
  • "Could be improved" suggestions that distract from real issues

This creates review fatigue. When 40% of findings are noise, users start ignoring all findings.

The Pattern

Each review finding must include a confidence assessment:

### Finding: Missing error boundary around API call
- **Severity**: HIGH
- **Confidence**: 90%
- **File**: src/api/client.ts:45
- **Issue**: Uncaught promise rejection could crash the app
- **Fix**: Wrap in try/catch with error reporting

### Finding: Variable name could be more descriptive
- **Severity**: LOW
- **Confidence**: 55%  ← FILTERED (below 80% threshold)

Only findings with confidence ≥ 80% appear in the review output. Lower-confidence findings are either dropped or collected in a separate "suggestions" section.
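The filtering rule above can be sketched in TypeScript. The `Finding` type and `CONFIDENCE_THRESHOLD` constant are illustrative assumptions, not part of any existing DevFlow API:

```typescript
// Illustrative types -- not an existing DevFlow API.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  title: string;
  severity: Severity;
  confidence: number; // 0-100
  file: string;       // e.g. "src/api/client.ts:45"
  issue: string;
  fix: string;
}

const CONFIDENCE_THRESHOLD = 80;

// Keep only findings at or above the threshold; everything else is
// dropped or routed to a separate "suggestions" section.
function filterFindings(findings: Finding[]): Finding[] {
  return findings.filter((f) => f.confidence >= CONFIDENCE_THRESHOLD);
}
```

Under this rule, the 90%-confidence finding above survives and the 55%-confidence one is filtered out.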

Why This Matters for DevFlow

Our /code-review spawns 7-11 Reviewer agents in parallel. Each reports all findings regardless of confidence. The Synthesizer deduplicates and merges, but doesn't filter by confidence — noise from individual reviewers propagates to the final report.

Adding confidence thresholds would:

  1. Reduce review report length by ~30-40% (estimated noise ratio)
  2. Increase user trust in review output
  3. Focus attention on real issues

Technical Approach

1. Update Reviewer Agent Prompt

Add confidence assessment requirement to shared/agents/reviewer.md:

## Finding Format

For each issue found, assess your confidence (0-100%):
- **90-100%**: Certain — clear bug, security vulnerability, or standards violation
- **80-89%**: High — likely issue based on context and patterns
- **60-79%**: Medium — possible issue, depends on context not visible
- **Below 60%**: Low — subjective preference or uncertain

**Only report findings with confidence ≥ 80%.** 

Collect lower-confidence observations in a separate "Suggestions" section (max 3 items).
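The routing described above (report at ≥80%, collect at most three lower-confidence items as suggestions) could look like this sketch; `Observation`, `ReviewOutput`, and `partitionByConfidence` are hypothetical names:

```typescript
// Hypothetical sketch of the routing rule: >=80% is reported as a
// finding, everything below is a suggestion, capped at three items.
interface Observation {
  title: string;
  confidence: number; // 0-100
}

interface ReviewOutput {
  findings: Observation[];    // confidence >= threshold, reported
  suggestions: Observation[]; // below threshold, max 3 items
}

function partitionByConfidence(obs: Observation[], threshold = 80): ReviewOutput {
  const findings = obs.filter((o) => o.confidence >= threshold);
  const suggestions = obs
    .filter((o) => o.confidence < threshold)
    .sort((a, b) => b.confidence - a.confidence) // keep the strongest three
    .slice(0, 3);
  return { findings, suggestions };
}
```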

2. Structured Output

## Critical & High Confidence Findings
[Only ≥80% confidence findings, severity-ordered]

## Suggestions (Lower Confidence)
[Max 3 items, clearly labeled as suggestions not findings]
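A minimal renderer for this two-section format might look like the following; the `ReportItem` shape and bullet layout are assumptions for illustration:

```typescript
// Hypothetical renderer for the two-section report format above.
const SEVERITY_ORDER: Record<string, number> = {
  CRITICAL: 0, HIGH: 1, MEDIUM: 2, LOW: 3,
};

interface ReportItem {
  title: string;
  severity: keyof typeof SEVERITY_ORDER;
  confidence: number;
}

function renderReport(findings: ReportItem[], suggestions: ReportItem[]): string {
  // Findings are severity-ordered, as the format requires.
  const sorted = [...findings].sort(
    (a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity]
  );
  const lines = ["## Critical & High Confidence Findings"];
  for (const f of sorted) {
    lines.push(`- [${f.severity}] ${f.title} (${f.confidence}%)`);
  }
  lines.push("", "## Suggestions (Lower Confidence)");
  for (const s of suggestions.slice(0, 3)) {
    lines.push(`- (suggestion) ${s.title} (${s.confidence}%)`);
  }
  return lines.join("\n");
}
```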

3. Synthesizer Integration

Update Synthesizer to:

  • Respect confidence levels during deduplication
  • Boost confidence when multiple reviewers flag the same issue
  • Maintain the ≥80% threshold in the final report
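The merge-and-boost step could be sketched as below. The boost rule (+5 points per corroborating reviewer, capped at 100) is an assumption for illustration; this issue does not specify a formula:

```typescript
// Sketch of the Synthesizer merge step: deduplicate by key, boost
// confidence when multiple reviewers flag the same issue, and keep
// the >=80% threshold in the final report.
interface RawFinding {
  key: string;        // e.g. file + rule id, used for deduplication
  confidence: number; // 0-100
}

function mergeFindings(all: RawFinding[], threshold = 80): RawFinding[] {
  // Group duplicate findings by key.
  const groups = new Map<string, RawFinding[]>();
  for (const f of all) {
    const g = groups.get(f.key) ?? [];
    g.push(f);
    groups.set(f.key, g);
  }
  const merged: RawFinding[] = [];
  for (const [key, group] of groups) {
    const base = Math.max(...group.map((f) => f.confidence));
    const boost = (group.length - 1) * 5; // assumed: +5 per extra reviewer
    merged.push({ key, confidence: Math.min(100, base + boost) });
  }
  return merged.filter((f) => f.confidence >= threshold);
}
```

Note how corroboration can rescue a borderline finding: two reviewers at 78% and 76% merge to 83%, which clears the threshold, while a lone 70% finding is dropped.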

Effort & Impact

  • Effort: Small (agent prompt update + synthesizer tweak)
  • Impact: Medium — cleaner reviews, higher user trust, less review fatigue
  • Risk: Low — worst case is missing a marginal issue that a user would have ignored anyway

Cross-Reference

Metadata


Labels

enhancement (New feature or request), post-v1.0.0 (Deferred to post-v1.0.0 release)
