Phase 2: Validate feedback-derived T-I-F decision provenance #163

Merged: ferhimedamine merged 4 commits into Dakera-AI:main from SeCuReDmE-main-dev:phase2/tif-feedback-provenance-validation on Jun 12, 2026

@SeCuReDmE-main-dev

Summary

Phase 2 validation package for RFC #161: feedback-derived T-I-F reliability and session-scoped decision provenance.

This extends the merged Phase 1 path without changing Dakera engine behavior.

Scope

  • public REST API only
  • metadata.reliability
  • local feedback-derived T-I-F evaluator
  • session-scoped metadata.decision_provenance trace memories
  • explicit memory links from decision traces to evidence memories
  • scenario recall proof using /v1/memory/recall
  • associated recall proof using include_associated=true
  • no engine changes
  • no recall filters
  • no SDK schema changes

Scenarios

  • Coding assistant: obsolete endpoint becomes contradiction evidence.
  • Research agent: weak-source evidence raises indeterminacy and asks clarification.
  • Customer support: outdated policy is surfaced as contradiction evidence, not deleted.

Validation

Validated locally against Dakera 0.11.90 on localhost port 3200.

python -m py_compile examples\tif-provenance\validate_tif_provenance.py examples\tif-reliability\validate_tif_reliability.py
python examples\tif-provenance\validate_tif_provenance.py --self-test
python examples\tif-reliability\validate_tif_reliability.py --self-test
docker compose -f docker\docker-compose.tif-phase1.yml down
docker compose -f docker\docker-compose.tif-phase1.yml up -d
python examples\tif-provenance\validate_tif_provenance.py --api http://localhost:3200 --request-timeout 240
$validationExit = $LASTEXITCODE
docker compose -f docker\docker-compose.tif-phase1.yml down
exit $validationExit

Result: all three scenarios passed.

The final runtime proof includes:

  • scenario_recall_proof: true
  • associated_recall_missing_ids: []
  • associated_recall_proof: true
  • session_trace_proof: true
  • passed: true
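
A run's proof can be checked mechanically against the fields listed above. The following is a minimal sketch, assuming the proof is available as a flat dictionary with exactly those keys; the helper name `proof_failures` is hypothetical, and the actual output shape of the validation script may differ.

```python
# Hypothetical helper: the flat-dict shape is an assumption, not the script's documented output.
REQUIRED_TRUE = (
    "scenario_recall_proof",
    "associated_recall_proof",
    "session_trace_proof",
    "passed",
)


def proof_failures(result):
    """Return the names of proof fields that do not match the expected values."""
    # Every boolean proof flag must be exactly True (a missing key counts as a failure).
    failures = [key for key in REQUIRED_TRUE if result.get(key) is not True]
    # The list of associated memories that recall failed to return must be empty.
    if result.get("associated_recall_missing_ids") != []:
        failures.append("associated_recall_missing_ids")
    return failures


example = {
    "scenario_recall_proof": True,
    "associated_recall_missing_ids": [],
    "associated_recall_proof": True,
    "session_trace_proof": True,
    "passed": True,
}
print(proof_failures(example))
```

An empty list means every proof field matched; any returned name points at the field to inspect first.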

Review loop

Fork review PR: SeCuReDmE-main-dev#2

Review status before upstream PR:

  • Qodo review: Bugs (0) after correction loop
  • Codex review on final commit 705681200c: no major issues found
  • Gemini feedback addressed or documented as non-actionable
  • Tests rerun after every correction batch

RFC context

@ferhimedamine left a comment

Phase 2 Deep Review — Approved

Validation Results (Dakera v0.11.90, in-memory, auth disabled)

Self-test (offline): 3/3 pass
Runtime validation: 3/3 pass with full proofs

Scenario         | Baseline → T-I-F Decision                | Recall | Session | Associated
coding-assistant | reuse_top_memory → surface_contradiction |        |         |
research-agent   | reuse_top_memory → ask_clarification     |        |         |
customer-support | reuse_top_memory → surface_contradiction |        |         |

Phase 1 backward compat: 4/4 pass ✓

Code Review

Strengths:

  • stdlib-only (zero external dependencies)
  • Deep-copies metadata before mutation — data safety
  • Unique agent_id per run — no test data pollution
  • Session cleanup in finally block
  • Defensive response normalization handles multiple API response shapes
  • Healthcheck properly validates ready: true
  • Security: localhost-only binding, auth disabled only for local validation

Minor notes (non-blocking):

  • TimeoutError catch at lines ~736/746 may not trigger for every timeout: urllib wraps timeouts raised while connecting or sending in URLError, so those surface through URLError.reason instead. The defensive retry is still fine as a pattern; just noting it.
  • assert direct is not None: assert statements are removed under -O optimization. A conditional raise would be slightly more robust for production-grade scripts.
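
Both notes can be addressed with small stdlib-only helpers. This is a sketch, not the code in the PR: `is_timeout` and `require_present` are hypothetical names, and how they would slot into the script's retry loop is an assumption.

```python
import socket
import urllib.error

# Hypothetical helpers illustrating the two review notes; not taken from the PR.
_TIMEOUT_TYPES = (TimeoutError, socket.timeout)


def is_timeout(exc):
    """Return True for a bare timeout or one wrapped in URLError.reason."""
    if isinstance(exc, _TIMEOUT_TYPES):
        return True
    if isinstance(exc, urllib.error.URLError):
        # URLError stores the underlying cause (possibly a timeout) in .reason.
        return isinstance(exc.reason, _TIMEOUT_TYPES)
    return False


def require_present(value, label):
    """Explicit check that, unlike assert, still runs under python -O."""
    if value is None:
        raise RuntimeError(f"{label} is missing")
    return value


print(is_timeout(urllib.error.URLError(socket.timeout("timed out"))))
print(is_timeout(urllib.error.URLError("connection refused")))
```

A retry loop could then catch `(urllib.error.URLError, TimeoutError)` and retry only when `is_timeout(exc)` is true, re-raising everything else.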

T-I-F Math Verification

All 6 derivations manually verified:

  • coding-obsolete: (0.38,0.20,0.34) + 2×downvote → (0.18,0.30,0.64) → f≥0.50 ✓
  • coding-current: (0.66,0.14,0.10) + upvote → (0.76,0.11,0.05) → t≥0.70 ✓
  • research-uncertain: (0.44,0.18,0.18) + 2×flag → (0.34,0.58,0.38) → i≥0.50 ✓
  • research-backed: (0.68,0.16,0.08) + upvote → (0.78,0.13,0.03) → t≥0.70 ✓
  • support-outdated: (0.42,0.18,0.30) + downvote+flag → (0.27,0.43,0.55) → f≥0.50 ✓
  • support-current: (0.67,0.12,0.07) + upvote → (0.77,0.09,0.02) → t≥0.70 ✓
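
The six derivations above are consistent with fixed additive per-event deltas on (t, i, f). The sketch below reproduces them; the delta values are inferred from the listed numbers rather than read from the validation script, and the names `DELTAS` and `apply_feedback` are hypothetical.

```python
# Deltas inferred from the six derivations in this review (an assumption,
# not copied from validate_tif_provenance.py). Order: (truth, indeterminacy, falsity).
DELTAS = {
    "upvote": (0.10, -0.03, -0.05),
    "downvote": (-0.10, 0.05, 0.15),
    "flag": (-0.05, 0.20, 0.10),
}


def apply_feedback(tif, events):
    """Apply each feedback event's delta in order and round to two decimals."""
    t, i, f = tif
    for event in events:
        dt, di, df = DELTAS[event]
        t, i, f = t + dt, i + di, f + df
    return tuple(round(x, 2) for x in (t, i, f))


# (start, events, expected result) exactly as listed in the review.
CASES = {
    "coding-obsolete": ((0.38, 0.20, 0.34), ["downvote", "downvote"], (0.18, 0.30, 0.64)),
    "coding-current": ((0.66, 0.14, 0.10), ["upvote"], (0.76, 0.11, 0.05)),
    "research-uncertain": ((0.44, 0.18, 0.18), ["flag", "flag"], (0.34, 0.58, 0.38)),
    "research-backed": ((0.68, 0.16, 0.08), ["upvote"], (0.78, 0.13, 0.03)),
    "support-outdated": ((0.42, 0.18, 0.30), ["downvote", "flag"], (0.27, 0.43, 0.55)),
    "support-current": ((0.67, 0.12, 0.07), ["upvote"], (0.77, 0.09, 0.02)),
}

for name, (start, events, expected) in CASES.items():
    got = apply_feedback(start, events)
    print(name, got, "ok" if got == expected else "MISMATCH")
```

Under these inferred deltas, each result also clears the threshold stated for it (f ≥ 0.50, i ≥ 0.50, or t ≥ 0.70). Whether the script clamps values to [0, 1] cannot be determined from these cases, since none of them leaves that range.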

Verdict

Clean Phase 2 delivery. Feedback-derived T-I-F, session provenance, and associated recall all work against the current production API. No engine changes. Examples-only scope maintained.

The docker-compose version bump from 0.11.81 to 0.11.90 and the Phase 1 healthcheck improvement are good additions.

@ferhimedamine ferhimedamine merged commit be2892a into Dakera-AI:main Jun 12, 2026
2 checks passed