Curator: evaluate & tune LLM consolidation quality (after outcome telemetry) #132

## Context

The periodic skill curator (`crates/bot/src/learning_curator.rs`, prompt `CURATOR_SYSTEM_PROMPT` in `crates/right-codegen/src/agent_def.rs`) ships an LLM consolidation pass: umbrella-merge near-duplicate `rightx-*` skills, demote narrow skills into an umbrella's `references/`, archive with `absorbed_into`. It is enabled by default and runs today.

What's missing: **we have no measurement of how *good* those consolidation decisions are.** We don't know whether the pass merges the right skills, over-merges, or rarely does anything useful. The marketing claim ("the curator decides two skills are duplicates and merges one into the other") is currently unproven in any deployment.

## Blocked on

Curator outcome telemetry + dashboard observability (track "A"). We need run-level outcome data (what each curator pass merged, archived, or demoted, and why) before we can judge or tune quality. Start this issue only **after** that lands.

## Scope (once telemetry exists)

- Evaluate the consolidation pass against a real skill library: are the umbrella / demote / archive decisions correct? Are there false merges or missed duplicates? Does the pass act at all?
- Tune `CURATOR_SYSTEM_PROMPT` from observed behavior.
- Consider a lightweight eval harness / golden cases for consolidation decisions.
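
As a starting point for the golden-case idea, here is a minimal sketch of how observed curator decisions could be scored against hand-labelled expectations. Everything in it is hypothetical: the `Decision` enum, the `Score` struct, and the `score` function are illustrative names, not existing types in `learning_curator.rs`, and the real shape should follow whatever the outcome telemetry ends up recording. Archiving with `absorbed_into` is assumed here to be part of a merge rather than a separate decision.

```rust
use std::collections::HashMap;

/// Hypothetical per-skill decision, loosely mirroring the actions named in this issue.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Decision {
    /// Merge the skill into the named umbrella skill.
    MergeInto(String),
    /// Demote the skill into the named umbrella's `references/`.
    DemoteInto(String),
    /// Leave the skill untouched.
    Keep,
}

/// Hypothetical tallies for one evaluated curator run.
#[derive(Debug, Default, PartialEq, Eq)]
struct Score {
    /// Observed decision matches the golden case exactly.
    correct: usize,
    /// Curator merged or demoted a skill the golden case keeps.
    false_merges: usize,
    /// Golden case expects a merge or demote, but the curator kept the skill.
    missed_duplicates: usize,
    /// Both act on the skill, but with a different action or target.
    wrong_target: usize,
}

/// Compare observed decisions (keyed by skill name) against golden cases.
fn score(golden: &HashMap<String, Decision>, observed: &HashMap<String, Decision>) -> Score {
    let keep = Decision::Keep;
    let mut s = Score::default();
    for (skill, expected) in golden {
        // A skill absent from the observed run is treated as left untouched.
        let actual = observed.get(skill).unwrap_or(&keep);
        match (expected, actual) {
            (e, a) if e == a => s.correct += 1,
            (Decision::Keep, _) => s.false_merges += 1,
            (_, Decision::Keep) => s.missed_duplicates += 1,
            _ => s.wrong_target += 1,
        }
    }
    s
}
```

The four counters map onto the questions in the first bullet: `false_merges` and `missed_duplicates` answer the over-merge and under-merge questions directly, and a run in which every non-`Keep` golden case lands in `missed_duplicates` suggests the pass is not acting at all.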

## References

- Deferred Phase-2: `docs/superpowers/specs/2026-05-22-prefilter-classifier-and-curator-state-design.md` §11 (outcome-driven prompt calibration).
- Curator design: `docs/superpowers/specs/2026-05-22-skill-learning-writer-curator-design.md`.
