feat: add T-I-F reliability helpers (TifScore, evaluate_tif)#119

Merged: ferhimedamine merged 7 commits into main from feat/tif-reliability-helpers on Jun 13, 2026

Conversation

@ferhimedamine ferhimedamine commented Jun 12, 2026

Contributor

Summary

Part of T-I-F RFC Phase 3: adds type-safe T-I-F reliability helpers.

Changes:

  • TifScore struct and TifClassification enum in types.rs (derive Serialize, Deserialize, Debug, Clone)
  • TifScore::from_feedback_history(&FeedbackHistoryResponse) constructor
  • TifScore::from_metadata(&serde_json::Value) -> Option<TifScore> constructor
  • classify_tif() private helper
  • evaluate_tif() async method on DakeraClient in memory.rs
  • 16 inline #[cfg(test)] tests in types.rs
  • TifScore/TifClassification automatically exported via pub use types::* in lib.rs
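As a rough illustration of the shape these helpers take, here is a dependency-free sketch. It is not the code from this PR: the `Signal` enum, the `from_signals` constructor, the `Reliable` variant, the 0.5 thresholds, and the upvote/flag/downvote mapping to truth/indeterminacy/falsity are all assumptions made for the example. The serde derives listed above are left out so the block compiles with the standard library alone, and the thin-evidence adjustment raised later in review is not included.

```rust
/// Classification outcome. `AskClarification` and `SurfaceContradiction`
/// mirror names used in this PR; `Reliable` is a hypothetical placeholder.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TifClassification {
    Reliable,
    AskClarification,
    SurfaceContradiction,
}

/// Field names follow the review comment (truth, indeterminacy, falsity,
/// classification, feedback_count); the field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct TifScore {
    pub truth: f64,
    pub indeterminacy: f64,
    pub falsity: f64,
    pub classification: TifClassification,
    pub feedback_count: usize,
}

/// Hypothetical stand-in for one entry of a feedback history.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Signal {
    Upvote,
    Downvote,
    Flag,
}

/// Private helper: falsity is checked before indeterminacy so that a tie
/// resolves to `SurfaceContradiction`. The 0.5 thresholds are assumptions.
fn classify_tif(indeterminacy: f64, falsity: f64) -> TifClassification {
    if falsity >= 0.5 {
        TifClassification::SurfaceContradiction
    } else if indeterminacy >= 0.5 {
        TifClassification::AskClarification
    } else {
        TifClassification::Reliable
    }
}

impl TifScore {
    /// Builds a score from raw feedback proportions.
    /// Assumed mapping: upvote -> truth, flag -> indeterminacy, downvote -> falsity.
    pub fn from_signals(signals: &[Signal]) -> TifScore {
        let n = signals.len();
        let count = |kind: Signal| signals.iter().filter(|s| **s == kind).count() as f64;
        let (truth, indeterminacy, falsity) = if n == 0 {
            // Assumption: no feedback is treated as fully indeterminate.
            (0.0, 1.0, 0.0)
        } else {
            let total = n as f64;
            (
                count(Signal::Upvote) / total,
                count(Signal::Flag) / total,
                count(Signal::Downvote) / total,
            )
        };
        TifScore {
            truth,
            indeterminacy,
            falsity,
            classification: classify_tif(indeterminacy, falsity),
            feedback_count: n,
        }
    }
}
```

Under these assumptions, an empty history classifies as `AskClarification`, and three downvotes plus three flags classify as `SurfaceContradiction` because the falsity check runs first.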

🤖 Generated with Claude Code


Reviewed-by: Jean-Sébastien Beaulieu (@SeCuReDmE-main-dev) — T-I-F contract parity review

Dakera CTO and others added 3 commits June 12, 2026 23:28
Implements Phase 3 of the T-I-F RFC (dakera-deploy#161).

- Add TifScore struct and TifClassification enum to types.rs with
  from_feedback_history and from_metadata constructors
- Add evaluate_tif() async method to DakeraClient (memory.rs)
- 16 inline unit tests covering all classification thresholds and edge cases
- TifScore and TifClassification exported via pub use types::*

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Reformat src/types.rs: wrap long json! literal, expand assert_eq! blocks,
  and break falsity_priority_over_indeterminacy make_history call to satisfy
  cargo fmt line-length rules
- Update CHANGELOG.md and README.md with TifScore / evaluate_tif documentation

Part of T-I-F RFC Phase 3 (DAK-6562).
Co-Authored-By: Paperclip <noreply@paperclip.ing>

@ferhimedamine ferhimedamine force-pushed the feat/tif-reliability-helpers branch from 1a3da98 to 586846e Compare June 12, 2026 23:29
Dakera CTO and others added 2 commits June 12, 2026 23:39
Wrap long make_history() call and expand assert_eq!(score.classification, ...)
to multi-line format matching cargo fmt expectations for two tests:
classification_surface_contradiction and falsity_priority_over_indeterminacy.

Part of T-I-F RFC Phase 3 (DAK-6562).

Co-Authored-By: Paperclip <noreply@paperclip.ing>

- Expand FeedbackHistoryEntry struct init to multi-line (line 2792)
- Expand assert_eq!(score.classification, ...) in all_downvotes test
- Split 10-element make_history call to stay within line width
- Wrap let score = TifScore::from_feedback_history(...) in positive_alias
  and negative_alias tests

All remaining cargo fmt --check diffs applied.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
@SeCuReDmE-main-dev

Phase 3 SDK review note from the RFC side.

The Rust shape is good: TifScore, TifClassification, from_feedback_history, from_metadata, and evaluate_tif() match the intended Phase 3 helper layer, and CI is green.

The main thing to resolve before this leaves draft is contract parity. This implementation appears to compute raw feedback proportions, while dakera-mcp#123 currently adds thin-evidence base indeterminacy for histories with fewer than 3 feedback events. No-feedback classification also needs to match MCP and the other SDKs exactly.

I recommend a shared golden-vector contract across MCP and all SDKs. The same input feedback history should always produce the same truth, indeterminacy, falsity, classification, and feedback_count, regardless of language.

Once that math contract is fixed, the Rust implementation looks aligned with Phase 3.
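One way to picture the golden-vector idea is a small table-driven check that every SDK runs against its own classifier. The sketch below is hypothetical: the `GoldenVector` struct, `failing_vectors`, and `toy_classify` are names invented for illustration, the `"reliable"` label and the classification rule are assumptions, and the two rows are only the cases named in this thread (no feedback, and 3 downvotes + 3 flags), not the canonical vector set.

```rust
/// One row of a hypothetical shared golden-vector table.
pub struct GoldenVector {
    pub name: &'static str,
    pub upvotes: usize,
    pub downvotes: usize,
    pub flags: usize,
    pub expected_classification: &'static str,
}

/// Illustrative rows only: these are the two cases mentioned in the review,
/// not the canonical T-I-F v1 vectors.
pub const VECTORS: &[GoldenVector] = &[
    GoldenVector {
        name: "no_feedback",
        upvotes: 0,
        downvotes: 0,
        flags: 0,
        expected_classification: "ask_clarification",
    },
    GoldenVector {
        name: "three_downvotes_three_flags",
        upvotes: 0,
        downvotes: 3,
        flags: 3,
        expected_classification: "surface_contradiction",
    },
];

/// Runs `classify` over every vector and returns the names of the rows
/// whose classification differs from the expected one.
pub fn failing_vectors(
    vectors: &[GoldenVector],
    classify: fn(usize, usize, usize) -> &'static str,
) -> Vec<&'static str> {
    vectors
        .iter()
        .filter(|v| classify(v.upvotes, v.downvotes, v.flags) != v.expected_classification)
        .map(|v| v.name)
        .collect()
}

/// Stand-in classifier with assumed rules: falsity (downvotes) is checked
/// before indeterminacy (flags), and no feedback asks for clarification.
pub fn toy_classify(upvotes: usize, downvotes: usize, flags: usize) -> &'static str {
    let n = upvotes + downvotes + flags;
    if n == 0 {
        return "ask_clarification";
    }
    if 2 * downvotes >= n {
        return "surface_contradiction";
    }
    if 2 * flags >= n {
        return "ask_clarification";
    }
    "reliable" // hypothetical label, not taken from the contract
}
```

Calling `failing_vectors(VECTORS, toy_classify)` returns an empty list when the classifier agrees with every row. In a cross-language setup the table would more plausibly live in a shared data file that each SDK loads, rather than being duplicated as constants.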

Dakera CTO and others added 2 commits June 13, 2026 00:26
…6566)

Aligns Rust SDK with MCP canonical T-I-F v1 contract:
- Inject base indeterminacy when feedback_count < 3 to prevent
  false confidence from sparse signals
- Normalise T+I+F to 1.0 after adding base indeterminacy
- Add 8 golden vector tests matching MCP/Python/JS/Go
- Add 3 thin-evidence unit tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
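The thin-evidence adjustment described in the commit above can be sketched as follows. Only the "fewer than 3 feedback events" threshold and the normalisation of T+I+F to 1.0 come from this thread; the `BASE_INDETERMINACY` value of 0.3, the `tif_components` function name, the signal mapping, and the handling of an empty history are assumptions chosen for illustration.

```rust
/// From the thread: histories with fewer than 3 feedback events count as thin evidence.
const THIN_EVIDENCE_THRESHOLD: usize = 3;

/// ASSUMPTION: the real constant is not stated in this thread; 0.3 is a placeholder.
const BASE_INDETERMINACY: f64 = 0.3;

/// Returns (truth, indeterminacy, falsity) for the given feedback counts.
/// Assumed mapping: upvotes -> truth, flags -> indeterminacy, downvotes -> falsity.
pub fn tif_components(upvotes: usize, downvotes: usize, flags: usize) -> (f64, f64, f64) {
    let n = upvotes + downvotes + flags;
    if n == 0 {
        // Assumption: no feedback is treated as fully indeterminate.
        return (0.0, 1.0, 0.0);
    }

    // Start from raw feedback proportions.
    let total = n as f64;
    let (mut t, mut i, mut f) = (
        upvotes as f64 / total,
        flags as f64 / total,
        downvotes as f64 / total,
    );

    if n < THIN_EVIDENCE_THRESHOLD {
        // Inject base indeterminacy so sparse signals do not look confident,
        // then renormalise so the three components sum to 1.0 again.
        i += BASE_INDETERMINACY;
        let sum = t + i + f;
        t /= sum;
        i /= sum;
        f /= sum;
    }
    (t, i, f)
}
```

With these placeholder values, a single upvote no longer yields a truth of 1.0, while three upvotes are past the threshold and keep their raw proportions.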
@SeCuReDmE-main-dev

@ferhimedamine final Phase 3 review complete from my side for the Rust SDK PR.

I rechecked the current PR state and the DAK-6566 parity fixes. The previous blockers are resolved:

  • CI is green.
  • PR is mergeable.
  • no-feedback now maps to ask_clarification.
  • thin-evidence base indeterminacy is present.
  • the 8 golden vectors are present and cover the canonical T-I-F v1 contract.
  • 3 downvotes + 3 flags correctly prioritize surface_contradiction.
  • metadata reliability parsing remains backward compatible with Phase 1 / Phase 2.

This is review-ready from my side. No further requested changes from me.

@ferhimedamine ferhimedamine marked this pull request as ready for review June 13, 2026 02:46
@ferhimedamine ferhimedamine added the auto-merge label (CTO auto-merge approved) Jun 13, 2026
@ferhimedamine ferhimedamine merged commit 2cfeb0c into main Jun 13, 2026
7 checks passed