feat(report): POST /api/report/<id>/signal — extract structured miro_signal from simulation #321
Open
LoryGlory wants to merge 3 commits into 666ghj:main
Conversation
…r services

- chat_json() gains configurable retry (max_attempts), temperature backoff (temperature_step), an optional retry delay, and a fallback_parser hook for service-specific rescue logic
- _clean_response_text: strip <think> tags and markdown code fences
- _fix_truncated_json: use unescaped-quote parity instead of a last-character check, to avoid spuriously quoting numeric values; fixes broken repair for arrays
- _try_fix_json: generic near-valid JSON salvage (newline normalisation, control-character stripping, greedy object extraction)
- simulation_config_generator: replace the raw OpenAI client with LLMClient, remove the duplicated local retry loop and truncated-JSON repair, pass service-specific config salvage as fallback_parser
- oasis_profile_generator: same refactor; keep the rule-based profile fallback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
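A minimal sketch of the retry shape this commit describes — the real `chat_json()` signature in LLMClient may differ, and `call_model` here is a stand-in for the OpenAI request rather than the PR's actual code:

```python
import json
import time


def chat_json(call_model, messages, max_attempts=3, temperature=0.1,
              temperature_step=0.05, retry_delay=0.0, fallback_parser=None):
    """Ask the model for JSON, retrying with a small temperature backoff.

    On a failed parse, an optional fallback_parser gets a chance to rescue
    the raw text before the next attempt (service-specific salvage hook).
    """
    last_error = None
    for attempt in range(max_attempts):
        # Nudge the temperature up on each retry to escape a bad sample
        temp = temperature + attempt * temperature_step
        raw = None
        try:
            raw = call_model(messages, temperature=temp)
            return json.loads(raw)
        except Exception as exc:
            last_error = exc
            if raw is not None and fallback_parser is not None:
                salvaged = fallback_parser(raw)
                if salvaged is not None:
                    return salvaged
            if retry_delay:
                time.sleep(retry_delay)
    raise ValueError(f"no valid JSON after {max_attempts} attempts: {last_error}")
```

Per the test list below, exhausting all attempts surfaces as a ValueError that callers (like the signal endpoint) can map to an HTTP status.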
30 tests covering _clean_response_text (think tags, markdown fences), _fix_truncated_json (unclosed braces/brackets/strings), _try_fix_json (near-valid JSON recovery), and chat_json retry behaviour (max_attempts, temperature backoff, fallback_parser, ValueError after all attempts fail, finish_reason == "length" truncation repair, API exception handling). All tests use a mocked OpenAI client — no real API calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…from simulation report
Adds a canonical machine-readable probability signal endpoint that distils
a completed simulation report into a structured prediction thesis.
New: backend/app/services/signal_extractor.py
- SignalExtractor.extract() calls chat_json() (3 attempts, temp 0.1, step 0.05)
against the report markdown and returns a validated MiroSignal dataclass
- Validates and normalises all fields: p_yes clamped to [0.01, 0.99],
confidence/action enums enforced, action recomputed from p_yes when invalid
- _trim_report() keeps the tail (conclusions) for long reports to stay within
token limits
- _salvage() fallback_parser recovers a minimal signal from partial LLM output
using regex probability extraction
New endpoint: POST /api/report/<report_id>/signal
- 404 if report not found
- 400 if report not yet completed or content is empty
- 422 if LLM fails after all retry attempts
- Returns canonical signal with thesis.{p_yes, confidence, action, regime,
summary, drivers, invalidators} — schema_version 1.1
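A framework-agnostic sketch of the status-code mapping above; the real handler lives in backend/app/api/report.py, and the dict-shaped report and `extract` callable here are stand-ins for the actual ORM model and SignalExtractor:

```python
def signal_endpoint(report, extract):
    """Map report state and extractor outcome to (body, status_code)."""
    if report is None:
        return {"error": "report not found"}, 404
    if report.get("status") != "completed" or not report.get("markdown_content"):
        return {"error": "report not completed or content empty"}, 400
    try:
        signal = extract(report["markdown_content"])
    except ValueError as exc:  # chat_json exhausted all retry attempts
        return {"error": str(exc)}, 422
    return signal, 200
```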
New: backend/tests/services/test_signal_extractor.py
- 27 tests covering happy path, field validation/normalisation, report
trimming, _salvage fallback, and LLM failure propagation
- No real API calls — LLMClient fully mocked
Closes 666ghj#277
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
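The field validation rules listed in the commit above can be sketched as follows — the 0.6/0.4 action thresholds and the 0.5 default for a missing p_yes are assumptions for illustration, not values taken from the PR:

```python
VALID_CONFIDENCE = {"high", "medium", "low"}
VALID_ACTION = {"buy_yes", "buy_no", "hold"}


def normalise_thesis(raw: dict) -> dict:
    """Clamp and enforce the thesis fields of a candidate signal."""
    p_yes = min(max(float(raw.get("p_yes", 0.5)), 0.01), 0.99)
    confidence = raw.get("confidence")
    if confidence not in VALID_CONFIDENCE:
        confidence = "medium"  # fall back rather than reject
    action = raw.get("action")
    if action not in VALID_ACTION:
        # Recompute from p_yes when the model emits an invalid action
        action = ("buy_yes" if p_yes > 0.6
                  else "buy_no" if p_yes < 0.4 else "hold")
    regime = raw.get("regime") or "uncertain"
    return {"p_yes": p_yes, "confidence": confidence,
            "action": action, "regime": regime}
```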
mohamedsorour1998 added a commit to mohamedsorour1998/MiroFish that referenced this pull request on Mar 27, 2026.
Problem
MiroFish produces rich simulation reports but no machine-readable output contract. External pipelines that want to consume a prediction probability (e.g. prediction-market bots, calibration trackers) have to parse unstructured markdown — or receive a JSON block that doesn't match any stable schema. This was reported in #277.
Solution
A new POST /api/report/<report_id>/signal endpoint that distils a completed report into a canonical probability signal with a stable schema.

Changes
backend/app/services/signal_extractor.py (new)

SignalExtractor.extract() takes the report's markdown_content and simulation_requirement and uses chat_json() to extract a structured thesis:

- _trim_report() keeps the tail of long reports (conclusions matter most for signal extraction)
- p_yes clamped to [0.01, 0.99]
- confidence enforced to high | medium | low; falls back to medium
- action enforced to buy_yes | buy_no | hold; recomputed from p_yes if invalid
- regime defaults to uncertain if missing
- _salvage() is wired as fallback_parser: if the LLM returns near-valid output, a regex scan extracts any float probability and constructs a minimal signal rather than failing

backend/app/api/report.py

New route POST /api/report/<report_id>/signal:

- 404 — report not found
- 400 — report not yet completed, or empty content
- 422 — LLM failed after all retry attempts (surfaced as ValueError)
- 200 — returns the canonical signal

backend/tests/services/test_signal_extractor.py (new)

27 tests, LLMClient fully mocked:

- to_dict() structure, LLM call parameters
- p_yes clamping, invalid confidence/action fallbacks, missing regime
- _salvage: probability extraction, action derivation, confidence detection
- ValueError propagates correctly

Canonical signal schema (v1.1)
```json
{
  "signal_id": "<uuid>",
  "schema_version": "1.1",
  "report_id": "report_xxxx",
  "simulation_id": "sim_xxxx",
  "generated_at": "2026-...",
  "thesis": {
    "p_yes": 0.73,
    "confidence": "high",
    "action": "buy_yes",
    "regime": "consensus_forming",
    "summary": "Strong agent consensus supports a YES outcome.",
    "drivers": ["70%+ agent agreement", "positive social momentum"],
    "invalidators": ["marginal counter-narrative", "low information diversity"]
  }
}
```

Dependency
This PR is built on top of #318 (feat/llm-structured-output-reliability) and uses its retry and JSON repair infrastructure in chat_json(). It can be merged independently if #318 lands first, or merged together with it.

Fixes
Closes #277
Testing