Design & implement confidence / uncertainty signals #130

@AlexChesser

Description

Summary

Design how LLM confidence/uncertainty signals (logprobs, token probabilities) are used for decision-making within pipelines. This is NOT IN SPEC and requires spec authoring.

Parent issue: #105 — Missing Modality F

Why

Modern LLM APIs expose logprobs. Research on selective prediction shows that confidence scores can be used to decide whether to accept an output, retry with more context, or escalate to a better model. This ties directly into model cascading — low-confidence responses trigger escalation to a more capable (and expensive) model.
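One possible shape for the first step — collapsing per-token logprobs into a single score — assuming the provider returns per-token logprobs (the function name and the geometric-mean choice are illustrative, not spec):

```python
import math

def aggregate_confidence(token_logprobs):
    """Collapse per-token logprobs into one confidence score in [0, 1].

    Uses the geometric mean of token probabilities, i.e.
    exp(mean(logprobs)), which is less biased by sequence length
    than multiplying raw probabilities together.
    """
    if not token_logprobs:
        return None  # provider did not expose logprobs for this response
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_logprob)
```

Other aggregations (minimum token probability, perplexity) are equally plausible; which one the spec mandates is exactly the open design question below.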

Dependencies

  • Native LLM provider support (#128) — must be designed first

Design Decisions Needed

  • Which confidence signals — logprobs? Aggregated confidence score? Model-reported uncertainty?
  • How confidence is exposed — template variable {{ step.<id>.confidence }}? Structured metadata?
  • How confidence drives flow — condition expressions? Automatic cascading trigger?
  • Provider differences — not all APIs expose the same signals. How to handle missing data?
  • Whether to support calibration (mapping raw logprobs to meaningful confidence scores)

Spec Work Required

New spec section needed. Depends on native LLM provider support (#128) being designed first.

Acceptance Criteria

  • Spec section authored
  • Confidence signals captured from LLM API responses
  • Signals available as template variables
  • Signals can drive conditional flow (e.g. escalation)
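The template-variable criterion could look roughly like this — a toy substitution for the `{{ step.<id>.confidence }}` syntax from the design questions (the syntax comes from this issue; the renderer and data shape are assumptions):

```python
import re

def render(template: str, steps: dict) -> str:
    """Substitute {{ step.<id>.<field> }} placeholders from step metadata.

    Minimal sketch: real pipelines would use a proper template engine;
    this only shows confidence flowing from step output into templates.
    """
    def sub(match):
        step_id, field = match.group(1), match.group(2)
        return str(steps[step_id][field])
    return re.sub(r"\{\{\s*step\.(\w+)\.(\w+)\s*\}\}", sub, template)

steps = {"extract": {"confidence": 0.42}}
print(render("confidence was {{ step.extract.confidence }}", steps))
```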
