feat: add T-I-F reliability helpers (TifScore, evaluate_tif) by ferhimedamine · Pull Request #119 · Dakera-AI/dakera-rs

ferhimedamine · 2026-06-12T22:55:04Z

Summary

Part of T-I-F RFC Phase 3 — adds type-safe T-I-F reliability helpers.

Changes:

TifScore struct and TifClassification enum in types.rs (derive Serialize, Deserialize, Debug, Clone)
TifScore::from_feedback_history(&FeedbackHistoryResponse) constructor
TifScore::from_metadata(&serde_json::Value) -> Option<TifScore> constructor
classify_tif() private helper
evaluate_tif() async method on DakeraClient in memory.rs
16 inline #[cfg(test)] tests in types.rs
TifScore/TifClassification automatically exported via pub use types::* in lib.rs

🤖 Generated with Claude Code

Reviewed-by: Jean-Sébastien Beaulieu (@SeCuReDmE-main-dev) — T-I-F contract parity review

Implements Phase 3 of the T-I-F RFC (dakera-deploy#161).

- Add TifScore struct and TifClassification enum to types.rs with from_feedback_history and from_metadata constructors
- Add evaluate_tif() async method to DakeraClient (memory.rs)
- 16 inline unit tests covering all classification thresholds and edge cases
- TifScore and TifClassification exported via pub use types::*

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Reformat src/types.rs: wrap long json! literal, expand assert_eq! blocks, and break falsity_priority_over_indeterminacy make_history call to satisfy cargo fmt line-length rules
- Update CHANGELOG.md and README.md with TifScore / evaluate_tif documentation

Part of T-I-F RFC Phase 3 (DAK-6562).

Co-Authored-By: Paperclip <noreply@paperclip.ing>

Wrap long make_history() call and expand assert_eq!(score.classification, ...) to multi-line format matching cargo fmt expectations for two tests: classification_surface_contradiction and falsity_priority_over_indeterminacy.

Part of T-I-F RFC Phase 3 (DAK-6562).

Co-Authored-By: Paperclip <noreply@paperclip.ing>

- Expand FeedbackHistoryEntry struct init to multi-line (line 2792)
- Expand assert_eq!(score.classification, ...) in all_downvotes test
- Split 10-element make_history call to stay within line width
- Wrap let score = TifScore::from_feedback_history(...) in positive_alias and negative_alias tests

All remaining cargo fmt --check diffs applied.

Co-Authored-By: Paperclip <noreply@paperclip.ing>

SeCuReDmE-main-dev · 2026-06-12T23:59:51Z

Phase 3 SDK review note from the RFC side.

The Rust shape is good: TifScore, TifClassification, from_feedback_history, from_metadata, and evaluate_tif() match the intended Phase 3 helper layer, and CI is green.

The main thing to resolve before this leaves draft is contract parity. This implementation appears to compute raw feedback proportions, while dakera-mcp#123 currently adds thin-evidence base indeterminacy for histories with fewer than 3 feedback events. No-feedback classification also needs to match MCP and the other SDKs exactly.

I recommend a shared golden-vector contract across MCP + all SDKs. Same input feedback history should always produce the same truth, indeterminacy, falsity, classification, and feedback_count, regardless of language.
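A shared golden-vector check of the kind recommended above could look roughly like the following. This is a minimal sketch, not code from the PR: `GoldenVector`, `failing_vectors`, and the tolerance `EPSILON` are hypothetical names, the input is reduced to three counters, and only the three numeric components are compared (a full contract would also cover classification and feedback_count).

```rust
/// One golden vector: an input feedback history (reduced here to three
/// counters, which is an assumption) plus the expected output.
pub struct GoldenVector {
    pub name: &'static str,
    pub upvotes: usize,
    pub downvotes: usize,
    pub flags: usize,
    /// Expected (truth, indeterminacy, falsity).
    pub expected: (f64, f64, f64),
}

/// Hypothetical tolerance for comparing floating-point components.
const EPSILON: f64 = 1e-9;

/// Runs every vector through `score` and returns the names of the vectors
/// whose output differs from the expected values by more than EPSILON.
pub fn failing_vectors<F>(vectors: &[GoldenVector], score: F) -> Vec<&'static str>
where
    F: Fn(usize, usize, usize) -> (f64, f64, f64),
{
    vectors
        .iter()
        .filter(|v| {
            let (t, i, f) = score(v.upvotes, v.downvotes, v.flags);
            let (et, ei, ef) = v.expected;
            (t - et).abs() > EPSILON || (i - ei).abs() > EPSILON || (f - ef).abs() > EPSILON
        })
        .map(|v| v.name)
        .collect()
}
```

Each implementation (MCP and every SDK) would load the same vector table and assert that the list of failures is empty, so any divergence in the math shows up as a named failing vector.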

Once that math contract is fixed, the Rust implementation looks aligned with Phase 3.

…6566) Aligns Rust SDK with MCP canonical T-I-F v1 contract:

- Inject base indeterminacy when feedback_count < 3 to prevent false confidence from sparse signals
- Normalise T+I+F to 1.0 after adding base indeterminacy
- Add 8 golden vector tests matching MCP/Python/JS/Go
- Add 3 thin-evidence unit tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
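The thin-evidence adjustment described in the commit above can be illustrated with a small sketch. It is not the SDK's implementation. The threshold of 3 and the normalisation to 1.0 come from the commit message. Everything else is assumed for illustration: the function name `tif_components`, the value of `BASE_INDETERMINACY`, the all-indeterminate result for an empty history, and the mapping of upvotes to truth, flags to indeterminacy, and downvotes to falsity.

```rust
/// From the commit message: histories with fewer than 3 events are "thin".
const THIN_EVIDENCE_THRESHOLD: usize = 3;
/// Placeholder value, NOT the canonical contract value.
const BASE_INDETERMINACY: f64 = 0.3;

/// Returns (truth, indeterminacy, falsity), which always sum to 1.0.
pub fn tif_components(upvotes: usize, downvotes: usize, flags: usize) -> (f64, f64, f64) {
    let count = upvotes + downvotes + flags;
    if count == 0 {
        // Assumption: with no evidence at all, everything is indeterminate.
        return (0.0, 1.0, 0.0);
    }

    // Raw feedback proportions (assumed signal-to-component mapping).
    let n = count as f64;
    let mut truth = upvotes as f64 / n;
    let mut indeterminacy = flags as f64 / n;
    let mut falsity = downvotes as f64 / n;

    if count < THIN_EVIDENCE_THRESHOLD {
        // Inject base indeterminacy, then renormalise so T + I + F = 1.0.
        indeterminacy += BASE_INDETERMINACY;
        let sum = truth + indeterminacy + falsity;
        truth /= sum;
        indeterminacy /= sum;
        falsity /= sum;
    }

    (truth, indeterminacy, falsity)
}
```

Under these assumptions a single upvote no longer yields a truth of 1.0, because part of the mass is shifted to indeterminacy, while three upvotes are left as raw proportions.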

SeCuReDmE-main-dev · 2026-06-13T00:50:38Z

@ferhimedamine final Phase 3 review complete from my side for the Rust SDK PR.

I rechecked the current PR state and the DAK-6566 parity fixes. The previous blockers are resolved:

CI is green.
PR is mergeable.
no-feedback now maps to ask_clarification.
thin-evidence base indeterminacy is present.
the 8 golden vectors are present and cover the canonical T-I-F v1 contract.
3 downvote + 3 flag correctly prioritizes surface_contradiction.
metadata reliability parsing remains backward compatible with Phase 1 / Phase 2.
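The classification behaviour confirmed in the list above (no feedback maps to ask_clarification, and falsity takes priority over indeterminacy) might be ordered as in the sketch below. The enum name `Classification`, the `NoAction` variant, both threshold values, and the choice to map high indeterminacy to `AskClarification` are hypothetical. The PR names only ask_clarification and surface_contradiction and does not state the thresholds.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Classification {
    AskClarification,
    SurfaceContradiction,
    /// Hypothetical placeholder for every other outcome.
    NoAction,
}

// Hypothetical thresholds, chosen only to make the example concrete.
const FALSITY_THRESHOLD: f64 = 0.4;
const INDETERMINACY_THRESHOLD: f64 = 0.4;

pub fn classify(indeterminacy: f64, falsity: f64, feedback_count: usize) -> Classification {
    // No feedback at all: ask for clarification (stated in the review).
    if feedback_count == 0 {
        return Classification::AskClarification;
    }
    // Falsity is checked before indeterminacy, so a history that is both
    // contradicted and uncertain surfaces the contradiction.
    if falsity >= FALSITY_THRESHOLD {
        return Classification::SurfaceContradiction;
    }
    // Assumption: high indeterminacy also asks for clarification.
    if indeterminacy >= INDETERMINACY_THRESHOLD {
        return Classification::AskClarification;
    }
    Classification::NoAction
}
```

If downvotes count towards falsity and flags towards indeterminacy (an assumption), a history of 3 downvotes and 3 flags gives 0.5 for each, and the ordering of the checks is what selects `SurfaceContradiction`.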

This is review-ready from my side. No further requested changes from me.

Dakera CTO and others added 3 commits June 12, 2026 23:28

ci: re-trigger CI checks after cargo fmt fix

586846e

Co-Authored-By: Paperclip <noreply@paperclip.ing>

ferhimedamine force-pushed the feat/tif-reliability-helpers branch from 1a3da98 to 586846e Compare June 12, 2026 23:29

Dakera CTO and others added 2 commits June 12, 2026 23:39

Dakera CTO and others added 2 commits June 13, 2026 00:26

style: rustfmt golden vector tests

f12e0c3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ferhimedamine mentioned this pull request Jun 13, 2026

feat(mcp): add dakera_tif_evaluate tool for T-I-F reliability scoring Dakera-AI/dakera-mcp#123

Merged

SeCuReDmE-main-dev mentioned this pull request Jun 13, 2026

RFC: T-I-F (Truth-Indeterminacy-Falsity) Decision Provenance Layer Dakera-AI/dakera-deploy#161

Closed

ferhimedamine marked this pull request as ready for review June 13, 2026 02:46

ferhimedamine added the auto-merge CTO auto-merge approved label Jun 13, 2026

ferhimedamine merged commit 2cfeb0c into main Jun 13, 2026
7 checks passed

ferhimedamine merged 7 commits into main from feat/tif-reliability-helpers

2 participants
