feat: add T-I-F reliability helpers (TifScore, evaluate_tif) #119
Implements Phase 3 of the T-I-F RFC (dakera-deploy#161).

- Add TifScore struct and TifClassification enum to types.rs with from_feedback_history and from_metadata constructors
- Add evaluate_tif() async method to DakeraClient (memory.rs)
- 16 inline unit tests covering all classification thresholds and edge cases
- TifScore and TifClassification exported via pub use types::*

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Reformat src/types.rs: wrap long json! literal, expand assert_eq! blocks, and break falsity_priority_over_indeterminacy make_history call to satisfy cargo fmt line-length rules
- Update CHANGELOG.md and README.md with TifScore / evaluate_tif documentation

Part of T-I-F RFC Phase 3 (DAK-6562).
Co-Authored-By: Paperclip <noreply@paperclip.ing>
1a3da98 to 586846e
Wrap long make_history() call and expand assert_eq!(score.classification, ...) to multi-line format matching cargo fmt expectations for two tests: classification_surface_contradiction and falsity_priority_over_indeterminacy.

Part of T-I-F RFC Phase 3 (DAK-6562).

Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Expand FeedbackHistoryEntry struct init to multi-line (line 2792)
- Expand assert_eq!(score.classification, ...) in all_downvotes test
- Split 10-element make_history call to stay within line width
- Wrap let score = TifScore::from_feedback_history(...) in positive_alias and negative_alias tests

All remaining cargo fmt --check diffs applied.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Phase 3 SDK review note from the RFC side.

The Rust shape is good:

The main thing to resolve before this leaves draft is contract parity. This implementation appears to compute raw feedback proportions, while I recommend a shared golden-vector contract across MCP + all SDKs. Same input feedback history should always produce the same

Once that math contract is fixed, the Rust implementation looks aligned with Phase 3.
…6566) Aligns Rust SDK with MCP canonical T-I-F v1 contract:

- Inject base indeterminacy when feedback_count < 3 to prevent false confidence from sparse signals
- Normalise T+I+F to 1.0 after adding base indeterminacy
- Add 8 golden vector tests matching MCP/Python/JS/Go
- Add 3 thin-evidence unit tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
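The thin-evidence adjustment described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the SDK's actual code: the `Tif` struct, the `apply_thin_evidence` function, the `BASE_INDETERMINACY` value of 0.3, and the all-zero fallback are hypothetical. Only the `feedback_count < 3` cutoff and the normalise-to-1.0 step come from the commit message.

```rust
/// Raw T-I-F proportions computed from a feedback history.
/// ASSUMPTION: hypothetical stand-in for the SDK's TifScore fields.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Tif {
    pub truth: f64,
    pub indeterminacy: f64,
    pub falsity: f64,
}

/// From the commit message: fewer than 3 feedback entries is thin evidence.
const THIN_EVIDENCE_THRESHOLD: usize = 3;

/// ASSUMPTION: illustrative value only; the canonical v1 contract value
/// is not stated in this PR.
const BASE_INDETERMINACY: f64 = 0.3;

/// Adds base indeterminacy for sparse histories, then rescales so that
/// truth + indeterminacy + falsity == 1.0.
pub fn apply_thin_evidence(raw: Tif, feedback_count: usize) -> Tif {
    let mut indeterminacy = raw.indeterminacy;
    if feedback_count < THIN_EVIDENCE_THRESHOLD {
        indeterminacy += BASE_INDETERMINACY;
    }

    let sum = raw.truth + indeterminacy + raw.falsity;
    if sum <= 0.0 {
        // ASSUMPTION: with no signal at all, report full indeterminacy
        // rather than dividing by zero.
        return Tif { truth: 0.0, indeterminacy: 1.0, falsity: 0.0 };
    }

    Tif {
        truth: raw.truth / sum,
        indeterminacy: indeterminacy / sum,
        falsity: raw.falsity / sum,
    }
}
```

Under these assumed constants, a single upvote no longer yields a truth value of 1.0, because part of the mass moves to indeterminacy before rescaling. A history with three or more entries whose proportions already sum to 1.0 passes through unchanged.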
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ferhimedamine final Phase 3 review complete from my side for the Rust SDK PR. I rechecked the current PR state and the DAK-6566 parity fixes. The previous blockers are resolved:
This is review-ready from my side. No further requested changes from me.
Summary
Part of T-I-F RFC Phase 3 — adds type-safe T-I-F reliability helpers.
Changes:
- `TifScore` struct and `TifClassification` enum in `types.rs` (derive Serialize, Deserialize, Debug, Clone)
- `TifScore::from_feedback_history(&FeedbackHistoryResponse)` constructor
- `TifScore::from_metadata(&serde_json::Value) -> Option<TifScore>` constructor
- `classify_tif()` private helper
- `evaluate_tif()` async method on `DakeraClient` in `memory.rs`
- `#[cfg(test)]` tests in `types.rs`
- `TifScore` / `TifClassification` automatically exported via `pub use types::*` in `lib.rs`

🤖 Generated with Claude Code
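The summary lists a private `classify_tif()` helper without showing it. The test name `falsity_priority_over_indeterminacy` suggests that falsity is checked before indeterminacy when both are high. A minimal sketch of that ordering, assuming hypothetical variant names and thresholds (the real `TifClassification` variants and cutoffs are not given in this PR):

```rust
/// ASSUMPTION: illustrative variants; not the SDK's TifClassification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Classification {
    Reliable,
    Uncertain,
    Contradicted,
}

/// ASSUMPTION: hypothetical stand-in for the SDK's TifScore fields.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Score {
    pub truth: f64,
    pub indeterminacy: f64,
    pub falsity: f64,
}

/// ASSUMPTION: both cutoffs are placeholders chosen for the example.
const FALSITY_THRESHOLD: f64 = 0.5;
const INDETERMINACY_THRESHOLD: f64 = 0.5;

/// Classifies a score, checking falsity first so that strong negative
/// feedback wins over a simultaneously high indeterminacy.
pub fn classify(score: Score) -> Classification {
    if score.falsity >= FALSITY_THRESHOLD {
        Classification::Contradicted
    } else if score.indeterminacy >= INDETERMINACY_THRESHOLD {
        Classification::Uncertain
    } else {
        Classification::Reliable
    }
}
```

Putting the falsity branch first makes the tie-break explicit in the control flow, so a score that crosses both cutoffs is reported as contradicted rather than merely uncertain.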
Reviewed-by: Jean-Sébastien Beaulieu (@SeCuReDmE-main-dev) — T-I-F contract parity review