
feat: add confidence score explanation to output metadata #421

Open
Deepak8858 wants to merge 1 commit into dreamlessx:main from Deepak8858:fix/issue-416-confidence-breakdown


@Deepak8858
Contributor

This PR addresses #416 by:

  1. Extending the LandmarkDiffPipeline.generate() output to include confidence and confidence_breakdown fields.
  2. Implementing a weighted confidence scoring system based on face detection, identity preservation (ArcFace similarity), landmark accuracy, and mask coverage.
  3. Adding an --explain flag to the CLI to print the confidence breakdown.
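
For readers following along, the weighted scoring described in item 2 could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from this PR: the function name `weighted_confidence`, the weight values, and the clamping to [0, 1] are all hypothetical. Only the four component names come from the description above.

```python
from typing import Dict, Tuple

# Hypothetical weights; the PR description does not state the actual values.
ASSUMED_WEIGHTS: Dict[str, float] = {
    "face_detection": 0.25,
    "identity_preservation": 0.35,
    "landmark_accuracy": 0.25,
    "mask_coverage": 0.15,
}


def weighted_confidence(
    scores: Dict[str, float],
    weights: Dict[str, float] = ASSUMED_WEIGHTS,
) -> Tuple[float, Dict[str, float]]:
    """Return (confidence, breakdown) as a weighted average of component scores."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")

    # Clamp each component to [0, 1] so one bad input cannot skew the average.
    breakdown = {
        name: min(1.0, max(0.0, float(scores[name]))) for name in weights
    }

    # Divide by the weight sum so the result stays in [0, 1]
    # even if the weights do not add up to exactly 1.
    total_weight = sum(weights.values())
    confidence = sum(weights[n] * breakdown[n] for n in weights) / total_weight
    return confidence, breakdown
```

Returning the clamped per-component values alongside the aggregate keeps the `confidence_breakdown` field consistent with the `confidence` value it explains.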

Fixes #416

Deepak8858 requested a review from dreamlessx as a code owner on March 21, 2026, 08:04
@dreamlessx
Owner

Great to see you back @Deepak8858! Fast turnaround on #416.

The _calculate_confidence() approach is clean — weighted average of face detection, identity preservation, landmark accuracy, and mask coverage. The --explain CLI flag is a nice touch.
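
As a rough picture of how an `--explain` flag like this can be wired up, here is a minimal argparse sketch. It is not the PR's CLI code: `build_parser`, `format_breakdown`, the help text, and the output layout are hypothetical, and only the flag name and the `confidence` / `confidence_breakdown` keys come from this thread.

```python
import argparse
from typing import Any, Dict


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a boolean --explain switch (hypothetical CLI)."""
    parser = argparse.ArgumentParser(prog="landmarkdiff-sketch")
    parser.add_argument(
        "--explain",
        action="store_true",
        help="print the per-component confidence breakdown",
    )
    return parser


def format_breakdown(result: Dict[str, Any]) -> str:
    """Render the overall confidence followed by one line per component."""
    lines = [f"confidence: {result['confidence']:.3f}"]
    for name, value in result["confidence_breakdown"].items():
        lines.append(f"  {name}: {value:.3f}")
    return "\n".join(lines)
```

A caller would then print `format_breakdown(result)` only when `args.explain` is set, so default output stays unchanged.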

A few suggestions:

  1. landmark_accuracy is currently hardcoded to 0.95. For a follow-up, we could compute this from actual NME when ground-truth landmarks are available.
  2. Missing tests — CI will require at least one test for the new method. Something like:
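    def test_confidence_breakdown_keys():
        # Assumes `pipe` (a LandmarkDiffPipeline) and `image` are set up
        # elsewhere in the test module, e.g. by fixtures.
        result = pipe.generate(image, procedure="rhinoplasty")
        assert "confidence" in result
        # The breakdown should expose exactly the four scored components.
        assert set(result["confidence_breakdown"].keys()) == {
            "face_detection", "identity_preservation",
            "landmark_accuracy", "mask_coverage",
        }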
  3. CI hasn't triggered — this might need a workflow approval in the Actions tab. I'll check.
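
To make suggestion 1 concrete, here is one way landmark_accuracy might be derived from NME (normalized mean error) in a follow-up. This is a sketch under assumptions, not a proposal for the exact implementation: the function names, the choice of normalizing distance (inter-ocular distance is a common one), and the linear mapping with a `max_nme` cutoff of 0.1 are all hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def normalized_mean_error(
    predicted: Sequence[Point],
    ground_truth: Sequence[Point],
    norm_distance: float,
) -> float:
    """Mean Euclidean landmark error divided by a normalizing distance."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("landmark lists must be non-empty and equal in length")
    if norm_distance <= 0:
        raise ValueError("norm_distance must be positive")

    mean_error = sum(
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ) / len(predicted)
    return mean_error / norm_distance


def landmark_accuracy_from_nme(nme: float, max_nme: float = 0.1) -> float:
    """Map NME to [0, 1]: 1.0 at zero error, 0.0 at or beyond max_nme.

    The linear mapping and the default cutoff are assumptions for illustration.
    """
    return max(0.0, 1.0 - nme / max_nme)
```

When no ground-truth landmarks are available, the caller would still need a fallback, such as the current fixed value.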

Not blocking on the hardcoded value — that's a fine v1. Please add a test and we'll merge once CI passes!

Development

Successfully merging this pull request may close these issues.

Add confidence score explanation to output metadata

2 participants