Dakera-AI · ferhimedamine · Jun 12, 2026
diff --git a/docker/docker-compose.tif-phase1.yml b/docker/docker-compose.tif-phase1.yml
@@ -1,6 +1,6 @@
 services:
   dakera:
-    image: ${DAKERA_IMAGE:-ghcr.io/dakera-ai/dakera:0.11.81}
+    image: ${DAKERA_IMAGE:-ghcr.io/dakera-ai/dakera:0.11.90}
     ports:
       - "127.0.0.1:3200:3000"
       - "127.0.0.1:51051:50051"

diff --git a/examples/tif-provenance/README.md b/examples/tif-provenance/README.md
@@ -0,0 +1,102 @@
+# T-I-F Feedback Provenance Phase 2
+
+This example validates Phase 2 of the Dakera T-I-F decision provenance RFC:
+
+https://github.com/Dakera-AI/dakera-deploy/issues/161
+
+Phase 1 proved that `metadata.reliability` survives store and recall and can
+change agent-side decisions. Phase 2 tests the next maintainer-requested
+question: can T-I-F scores be derived from real agent interaction signals and
+used in a session-scoped decision trace?
+
+## What This Tests
+
+The example uses Dakera's public REST API only:
+
+- `POST /v1/memory/store`
+- `POST /v1/memory/recall`
+- `POST /v1/memories/{memory_id}/feedback`
+- `GET /v1/memories/{memory_id}/feedback`
+- `POST /v1/sessions/start`
+- `GET /v1/sessions/{session_id}/memories`
+- `POST /v1/memories/{memory_id}/links`
+
+The validation remains agent-side. Dakera stores memories, feedback, sessions,
+and links. The local script computes T-I-F from feedback and stores a decision
+trace under `metadata.decision_provenance`.
+
+Dakera `v0.11.90` requires `agent_id` when submitting feedback, reading
+feedback history, and creating memory links. The validator keeps those
+requirements explicit instead of hiding them behind an SDK.
+
+## Feedback-Derived T-I-F Rules
+
+```text
+upvote:   t + 0.10, i - 0.03, f - 0.05
+downvote: t - 0.10, i + 0.05, f + 0.15
+flag:     t - 0.05, i + 0.20, f + 0.10
+```
+
+Scores are clamped to `[0.0, 1.0]`.
+
+Decision priority:
+
+```text
+f >= 0.50 -> surface_contradiction
+i >= 0.50 -> ask_clarification
+t >= 0.70 and i <= 0.35 and f <= 0.35 -> reuse_confidently
+otherwise -> reuse_with_caveat
+```
+
+These thresholds are validation rules only. They are not proposed as Dakera
+engine behavior.
+
+## Scenarios
+
+The fixture covers three developer-recognizable workflows:
+
+| Scenario | Purpose |
+|---|---|
+| `coding-assistant` | feedback corrects an obsolete endpoint decision |
+| `research-agent` | weak-source feedback raises indeterminacy |
+| `customer-support` | outdated policy is surfaced as contradiction evidence |
+
+Each scenario records:
+
+- baseline importance-only decision;
+- feedback-derived T-I-F decision;
+- decision trace memory;
+- session ID;
+- linked evidence memory IDs;
+- associated recall proof.
+
+## Start Dakera
+
+The shared T-I-F compose file defaults to Dakera `v0.11.90`, binds to
+`127.0.0.1`, and disables auth only for local validation. Do not run it on a
+shared or internet-facing host.
+
+```bash
+docker compose -f docker/docker-compose.tif-phase1.yml up -d
+```
+
+Stop Dakera when validation is finished:
+
+```bash
+docker compose -f docker/docker-compose.tif-phase1.yml down
+```
+
+## Run Self-Test
+
+```bash
+python examples/tif-provenance/validate_tif_provenance.py --self-test
+```
+
+## Run Runtime Validation
+
+```bash
+python examples/tif-provenance/validate_tif_provenance.py --api http://localhost:3200 --request-timeout 240
+```
+
+The script fails if feedback history, session trace storage, or associated
+recall proof is missing.
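Review note on the README rules: the feedback deltas and the decision priority amount to a small pure function. The following is a minimal Python sketch, not the code in `validate_tif_provenance.py`. It assumes deltas are applied in feedback order and clamped after every step (the README says scores are clamped but not when), and the names `derive_tif` and `decide` are hypothetical.

```python
from typing import Dict, Iterable

# Per-signal deltas copied from the README section "Feedback-Derived T-I-F Rules".
FEEDBACK_DELTAS: Dict[str, Dict[str, float]] = {
    "upvote": {"t": 0.10, "i": -0.03, "f": -0.05},
    "downvote": {"t": -0.10, "i": 0.05, "f": 0.15},
    "flag": {"t": -0.05, "i": 0.20, "f": 0.10},
}


def clamp(value: float) -> float:
    """Keep a score inside [0.0, 1.0]."""
    return max(0.0, min(1.0, value))


def derive_tif(seed: Dict[str, float], feedback: Iterable[str]) -> Dict[str, float]:
    """Apply feedback signals to seed t/i/f scores.

    Assumption: clamping happens after each signal, not once at the end.
    """
    scores = {key: float(seed[key]) for key in ("t", "i", "f")}
    for signal in feedback:
        if signal not in FEEDBACK_DELTAS:
            # A clear error instead of a raw KeyError.
            raise ValueError(f"unsupported feedback signal: {signal!r}")
        for key, delta in FEEDBACK_DELTAS[signal].items():
            scores[key] = clamp(scores[key] + delta)
    return scores


def decide(scores: Dict[str, float]) -> str:
    """Apply the README decision priority, highest priority first."""
    if scores["f"] >= 0.50:
        return "surface_contradiction"
    if scores["i"] >= 0.50:
        return "ask_clarification"
    if scores["t"] >= 0.70 and scores["i"] <= 0.35 and scores["f"] <= 0.35:
        return "reuse_confidently"
    return "reuse_with_caveat"


# Seed values for `coding-obsolete-endpoint` from phase2_scenarios.json.
obsolete = derive_tif({"t": 0.38, "i": 0.20, "f": 0.34}, ["downvote", "downvote"])
action = decide(obsolete)
```

With these seed values, two downvotes move `f` from 0.34 to roughly 0.64, which crosses the 0.50 threshold, so `action` is `surface_contradiction`. Checking `f` before `i` matters for `support-outdated-policy`, where a downvote plus a flag raises both scores.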
diff --git a/examples/tif-provenance/VALIDATION_RESULTS.md b/examples/tif-provenance/VALIDATION_RESULTS.md
@@ -0,0 +1,152 @@
+# Phase 2 Validation Results
+
+Date: 2026-06-12 17:07:47 -04:00
+
+Status: passed local runtime validation.
+
+## Target Runtime
+
+```text
+Dakera image: ghcr.io/dakera-ai/dakera:0.11.90
+REST: http://127.0.0.1:3200
+gRPC: 127.0.0.1:51051
+Storage: in-memory
+Auth: disabled for local validation only
+```
+
+The validation compose binds ports to localhost only.
+
+## Commands
+
+```powershell
+python -m py_compile examples\tif-provenance\validate_tif_provenance.py
+python examples\tif-provenance\validate_tif_provenance.py --self-test
+docker compose -f docker\docker-compose.tif-phase1.yml down
+docker compose -f docker\docker-compose.tif-phase1.yml up -d
+python examples\tif-provenance\validate_tif_provenance.py --api http://localhost:3200 --request-timeout 240
+docker compose -f docker\docker-compose.tif-phase1.yml down
+```
+
+## Acceptance Criteria
+
+- all three scenarios pass;
+- feedback endpoints accept `upvote`, `downvote`, and `flag`;
+- feedback history is readable;
+- feedback-derived T-I-F changes at least one decision per scenario;
+- decision trace memory is stored with `metadata.decision_provenance`;
+- session memories include the trace and evidence memories;
+- associated recall returns linked evidence or contradiction memories;
+- no engine code is modified;
+- no first-class recall filters are added.
+
+## Result Summary
+
+All three scenarios passed against Dakera `0.11.90`.
+
+Runtime health reported:
+
+```json
+{
+  "ready": true,
+  "version": "0.11.90",
+  "checks": {
+    "embedding_engine": "ok",
+    "storage": "ok",
+    "tiered_engine": "disabled"
+  }
+}
+```
+
+Scenario outcomes:
+
+| Scenario | Baseline action | Feedback-derived T-I-F action | Decision changed | Session proof | Associated recall proof |
+| --- | --- | --- | --- | --- | --- |
+| coding-assistant | `reuse_top_memory` | `surface_contradiction` | yes | yes | yes |
+| research-agent | `reuse_top_memory` | `ask_clarification` | yes | yes | yes |
+| customer-support | `reuse_top_memory` | `surface_contradiction` | yes | yes | yes |
+
+The runtime accepted the feedback signals `upvote`, `downvote`, and `flag`, and feedback history was readable for every seeded memory. Each scenario stored a decision trace with `metadata.decision_provenance`, and the session memory listing included the trace and the evidence memories. Recalling the decision trace with `include_associated=true` and `associated_memories_depth=1` returned the linked evidence memories.
+
+Runtime contract notes observed on Dakera `0.11.90`:
+
+- `POST /v1/sessions/start` returns the session id as `session.id`.
+- `POST /v1/memories/{memory_id}/feedback` requires `agent_id`.
+- `GET /v1/memories/{memory_id}/feedback` requires `agent_id` as a query parameter.
+- `POST /v1/memories/{memory_id}/links` requires `agent_id`.
+
+No engine code was modified. No first-class recall filters were added.
+
+## Review Correction Rerun
+
+Date: 2026-06-12 17:20:58 -04:00
+
+Corrections after fork review:
+
+- healthcheck now requires `ready: true` before runtime validation proceeds;
+- unsupported feedback signals now produce a clear validation error instead of a raw `KeyError`;
+- Phase 1 recall normalization was reviewed and already handles list, dict, and nested `memory` response shapes.
+
+Rerun commands:
+
+```powershell
+python -m py_compile examples\tif-provenance\validate_tif_provenance.py examples\tif-reliability\validate_tif_reliability.py
+python examples\tif-provenance\validate_tif_provenance.py --self-test
+python examples\tif-reliability\validate_tif_reliability.py --self-test
+docker compose -f docker\docker-compose.tif-phase1.yml down
+docker compose -f docker\docker-compose.tif-phase1.yml up -d
+python examples\tif-provenance\validate_tif_provenance.py --api http://localhost:3200 --request-timeout 240
+docker compose -f docker\docker-compose.tif-phase1.yml down
+```
+
+Result: passed.
+
+## Codex Review Correction Rerun
+
+Date: 2026-06-12 18:02:19 -04:00
+
+Additional Codex review findings corrected:
+
+- runtime decisions now use the normalized `/v1/memory/recall` response for each scenario query before choosing the baseline and feedback-aware memory;
+- each scenario records `scenario_recall_proof` and the recalled fixture/runtime memory IDs;
+- associated recall proof now verifies that every linked evidence memory appears in the full associated recall response and reports `associated_recall_missing_ids`;
+- runtime validation was rerun with PowerShell preserving the validator exit code before Docker cleanup.
+
+Rerun commands:
+
+```powershell
+python -m py_compile examples\tif-provenance\validate_tif_provenance.py examples\tif-reliability\validate_tif_reliability.py
+python examples\tif-provenance\validate_tif_provenance.py --self-test
+python examples\tif-reliability\validate_tif_reliability.py --self-test
+docker compose -f docker\docker-compose.tif-phase1.yml down
+docker compose -f docker\docker-compose.tif-phase1.yml up -d
+python examples\tif-provenance\validate_tif_provenance.py --api http://localhost:3200 --request-timeout 240
+$validationExit = $LASTEXITCODE
+docker compose -f docker\docker-compose.tif-phase1.yml down
+exit $validationExit
+```
+
+Result: passed. All three scenarios returned `scenario_recall_proof: true`, `associated_recall_missing_ids: []`, `associated_recall_proof: true`, and `passed: true`.
+
+## Second Review Correction Rerun
+
+Date: 2026-06-12 17:40:38 -04:00
+
+Additional Qodo findings corrected:
+
+- runtime `changed_decision` now mirrors the self-test logic and treats same-memory `reuse_confidently` as unchanged reuse;
+- runtime memory metadata is deep-copied before adding derived reliability, and malformed or missing `metadata.reliability` now fails with a clear validation error;
+- associated recall keeps a single read-only retry to tolerate cold reranker startup without retrying mutating endpoints.
+
+Rerun commands:
+
+```powershell
+python -m py_compile examples\tif-provenance\validate_tif_provenance.py examples\tif-reliability\validate_tif_reliability.py
+python examples\tif-provenance\validate_tif_provenance.py --self-test
+python examples\tif-reliability\validate_tif_reliability.py --self-test
+docker compose -f docker\docker-compose.tif-phase1.yml down
+docker compose -f docker\docker-compose.tif-phase1.yml up -d
+python examples\tif-provenance\validate_tif_provenance.py --api http://localhost:3200 --request-timeout 240
+docker compose -f docker\docker-compose.tif-phase1.yml down
+```
+
+Result: passed.
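Review note on the associated recall proof: the results file reports `associated_recall_missing_ids` and `associated_recall_proof` per scenario. The check it describes can be sketched as a set comparison between the linked evidence IDs and the IDs found in the associated recall response. This is a hedged illustration only: the function name `associated_recall_report` is hypothetical, the inputs are assumed to be already-normalized lists of memory IDs, and treating a scenario with no links as unproven is an assumption rather than documented validator behavior.

```python
from typing import Dict, Iterable, List


def associated_recall_report(
    linked_ids: Iterable[str], recalled_ids: Iterable[str]
) -> Dict[str, object]:
    """Compare linked evidence IDs with the IDs seen in an associated recall."""
    linked = list(linked_ids)
    seen = set(recalled_ids)
    # Preserve link order so the report is stable across runs.
    missing: List[str] = [memory_id for memory_id in linked if memory_id not in seen]
    return {
        "associated_recall_missing_ids": missing,
        # Assumption: a scenario without links cannot prove associated recall.
        "associated_recall_proof": bool(linked) and not missing,
    }


# Hypothetical IDs, for illustration only.
report = associated_recall_report(
    linked_ids=["mem-evidence-1", "mem-evidence-2"],
    recalled_ids=["mem-trace", "mem-evidence-1"],
)
```

Here `report` lists `mem-evidence-2` as missing and sets `associated_recall_proof` to `False`, which is the failure the validator is described as reporting instead of passing on a partial match.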
diff --git a/examples/tif-provenance/phase2_scenarios.json b/examples/tif-provenance/phase2_scenarios.json
@@ -0,0 +1,128 @@
+{
+  "agent_id": "dakera-tif-phase2",
+  "scenarios": [
+    {
+      "id": "coding-assistant",
+      "title": "Coding assistant review correction",
+      "query": "Which Dakera REST endpoint should the coding assistant use for storing memory with reliability metadata?",
+      "expected_action": "surface_contradiction",
+      "expected_changed_decision": true,
+      "expected_direct_memory": "coding-obsolete-endpoint",
+      "expected_safe_memory": "coding-current-endpoint",
+      "memories": [
+        {
+          "id": "coding-current-endpoint",
+          "content": "Dakera memory store examples should use POST /v1/memory/store for the current public REST API.",
+          "importance": 0.84,
+          "feedback": ["upvote"],
+          "metadata": {
+            "reliability": {
+              "t": 0.66,
+              "i": 0.14,
+              "f": 0.10,
+              "basis": "Phase 1 runtime validation and maintainer review",
+              "source": "phase2_seed"
+            }
+          }
+        },
+        {
+          "id": "coding-obsolete-endpoint",
+          "content": "Dakera examples should use POST /v1/memories when storing agent memories.",
+          "importance": 0.93,
+          "feedback": ["downvote", "downvote"],
+          "metadata": {
+            "reliability": {
+              "t": 0.38,
+              "i": 0.20,
+              "f": 0.34,
+              "basis": "obsolete quickstart assumption superseded by current API behavior",
+              "source": "phase2_seed"
+            }
+          }
+        }
+      ]
+    },
+    {
+      "id": "research-agent",
+      "title": "Research agent source conflict",
+      "query": "Should the research agent cite an unsupported secondary note as confirmed evidence?",
+      "expected_action": "ask_clarification",
+      "expected_changed_decision": true,
+      "expected_direct_memory": "research-uncertain-source",
+      "expected_safe_memory": "research-source-backed",
+      "memories": [
+        {
+          "id": "research-source-backed",
+          "content": "A research agent should prefer source-backed claims and cite the primary evidence when summarizing technical decisions.",
+          "importance": 0.80,
+          "feedback": ["upvote"],
+          "metadata": {
+            "reliability": {
+              "t": 0.68,
+              "i": 0.16,
+              "f": 0.08,
+              "basis": "primary-source research discipline",
+              "source": "phase2_seed"
+            }
+          }
+        },
+        {
+          "id": "research-uncertain-source",
+          "content": "A research agent can treat an uncited secondary note as confirmed evidence when it sounds plausible.",
+          "importance": 0.92,
+          "feedback": ["flag", "flag"],
+          "metadata": {
+            "reliability": {
+              "t": 0.44,
+              "i": 0.18,
+              "f": 0.18,
+              "basis": "weak-source pattern flagged during review",
+              "source": "phase2_seed"
+            }
+          }
+        }
+      ]
+    },
+    {
+      "id": "customer-support",
+      "title": "Customer support outdated policy",
+      "query": "Which customer support policy should the agent reuse when an old process conflicts with the current escalation rule?",
+      "expected_action": "surface_contradiction",
+      "expected_changed_decision": true,
+      "expected_direct_memory": "support-outdated-policy",
+      "expected_safe_memory": "support-current-policy",
+      "memories": [
+        {
+          "id": "support-current-policy",
+          "content": "Customer support agents should follow the current escalation policy and ask for verification when a prior policy conflicts.",
+          "importance": 0.83,
+          "feedback": ["upvote"],
+          "metadata": {
+            "reliability": {
+              "t": 0.67,
+              "i": 0.12,
+              "f": 0.07,
+              "basis": "current support process",
+              "source": "phase2_seed"
+            }
+          }
+        },
+        {
+          "id": "support-outdated-policy",
+          "content": "Customer support agents should always use the old refund process without checking for newer escalation rules.",
+          "importance": 0.91,
+          "feedback": ["downvote", "flag"],
+          "metadata": {
+            "reliability": {
+              "t": 0.42,
+              "i": 0.18,
+              "f": 0.30,
+              "basis": "outdated policy deliberately retained as contradiction evidence",
+              "source": "phase2_seed"
+            }
+          }
+        }
+      ]
+    }
+  ]
+}
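Review note on the fixture: a file of this shape can be sanity-checked before any runtime call. The sketch below is illustrative and not part of the PR. `check_scenario` is a hypothetical helper, and the last check assumes that the importance-only baseline picks the highest-importance memory and that `expected_direct_memory` names that memory, which matches the three scenarios above but is not stated in the fixture itself.

```python
from typing import Any, Dict, List

SUPPORTED_SIGNALS = {"upvote", "downvote", "flag"}


def check_scenario(scenario: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    name = scenario["id"]
    memories = {memory["id"]: memory for memory in scenario["memories"]}

    # Both expected memories must be seeded by the scenario.
    for field in ("expected_direct_memory", "expected_safe_memory"):
        if scenario[field] not in memories:
            problems.append(f"{name}: {field} is not a seeded memory")

    for memory_id, memory in memories.items():
        reliability = memory.get("metadata", {}).get("reliability")
        if not isinstance(reliability, dict):
            problems.append(f"{name}/{memory_id}: missing metadata.reliability")
            continue
        for key in ("t", "i", "f"):
            value = reliability.get(key)
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not 0.0 <= value <= 1.0:
                problems.append(f"{name}/{memory_id}: reliability.{key} not in [0, 1]")
        for signal in memory.get("feedback", []):
            if signal not in SUPPORTED_SIGNALS:
                problems.append(f"{name}/{memory_id}: unsupported signal {signal!r}")

    # Assumption: the importance-only baseline reuses the top-importance memory.
    if memories and scenario["expected_direct_memory"] in memories:
        top_id = max(memories.values(), key=lambda m: m["importance"])["id"]
        if top_id != scenario["expected_direct_memory"]:
            problems.append(f"{name}: expected_direct_memory is not top importance")
    return problems


# Trimmed copy of the `coding-assistant` scenario (content fields omitted).
SAMPLE = {
    "id": "coding-assistant",
    "expected_direct_memory": "coding-obsolete-endpoint",
    "expected_safe_memory": "coding-current-endpoint",
    "memories": [
        {
            "id": "coding-current-endpoint",
            "importance": 0.84,
            "feedback": ["upvote"],
            "metadata": {"reliability": {"t": 0.66, "i": 0.14, "f": 0.10}},
        },
        {
            "id": "coding-obsolete-endpoint",
            "importance": 0.93,
            "feedback": ["downvote", "downvote"],
            "metadata": {"reliability": {"t": 0.38, "i": 0.20, "f": 0.34}},
        },
    ],
}

problems = check_scenario(SAMPLE)
```

Returning a list of problems rather than raising on the first one lets a reviewer see every fixture issue in a single run.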