```markdown
- **Secure Computation (aTLS)** — Attested TLS verifies the TEE hardware and software stack before any patient data is uploaded
- **Multi-Party Computation** — Three independent hospitals each upload proprietary EHR datasets into the same encrypted enclave
- **Real-World Data** — Uses the [UCI Diabetes 130-US Hospitals](https://www.kaggle.com/datasets/jimschacko/10-years-diabetes-dataset) dataset (~100K real patient encounters) split across simulated hospitals
- **Healthcare Value** — Benchmark proves the consortium model outperforms any single-hospital model at predicting 30-day readmissions
```
The README states that the consortium model outperforms any single-hospital model, but this may not always be the case depending on the dataset and evaluation setup.
It might be better to phrase this as a comparison rather than a guaranteed improvement.
```markdown
Healthcare - Multi-Hospital Patient Readmission Prediction (Inference / Analysis)

Loads the trained consortium readmission model and produces evaluation
metrics, visualizations, and a summary report demonstrating the value of
```
This wording suggests that the consortium approach provides better results than single-hospital models, which may not always be the case depending on the dataset and evaluation setup.
It may be better to phrase this more neutrally.
```python
print("=" * 60)
X = combined[feature_cols].values
y = combined["readmitted_30d"].values
X_train, X_test, y_train, y_test = train_test_split(
```
This example uses a random train/test split for a healthcare prediction task. Since the dataset spans multiple years, a time-based split may provide a more realistic evaluation and reduce the risk of overly optimistic results.
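A time-based split could be sketched roughly as follows. This is a minimal illustration only: it assumes a hypothetical `encounter_year` column, and the real dataset may record time differently (or not per encounter at all).

```python
import pandas as pd

# Minimal sketch of a time-based holdout: train on earlier years,
# evaluate on later ones. The "encounter_year" column is hypothetical.
combined = pd.DataFrame({
    "encounter_year": [1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008],
    "readmitted_30d": [0, 1, 0, 0, 1, 0, 1, 0, 1, 0],
})

cutoff = combined["encounter_year"].quantile(0.8)  # hold out the most recent ~20%
train = combined[combined["encounter_year"] <= cutoff]
test = combined[combined["encounter_year"] > cutoff]

print(len(train), len(test))  # → 8 2
```

Unlike a random split, this ensures the model is never scored on encounters that predate its training data, which is closer to how the model would be used in practice.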
```python
df[col] = df[col].map(med_map).fillna(0).astype(int)
df["num_med_changes"] = df[[c for c in MEDICATION_COLS if c in df.columns]].sum(axis=1)
df["total_visits"] = df["number_outpatient"] + df["number_emergency"] + df["number_inpatient"]
for col in CATEGORICAL_COLS:
```
Label encoding is applied independently to each dataset before concatenation.
This may result in inconsistent encodings for the same categorical values across hospitals, which can negatively affect the model.
It would be better to ensure a consistent mapping across all datasets.
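One way to get a consistent mapping is to build the category-to-integer table from the union of all hospital frames before encoding any of them. A minimal sketch (the frame and column names here are illustrative, not from the PR):

```python
import pandas as pd

# Build one shared mapping per categorical column across all hospital
# frames, so e.g. "Caucasian" maps to the same integer everywhere.
hospital_a = pd.DataFrame({"race": ["Caucasian", "Asian"]})
hospital_b = pd.DataFrame({"race": ["AfricanAmerican", "Caucasian"]})

frames = [hospital_a, hospital_b]
for col in ["race"]:
    categories = sorted(pd.concat(f[col] for f in frames).dropna().unique())
    mapping = {cat: i for i, cat in enumerate(categories)}
    for f in frames:
        f[col] = f[col].map(mapping).fillna(-1).astype(int)

print(hospital_a["race"].tolist(), hospital_b["race"].tolist())  # → [2, 1] [0, 2]
```

Encoding each frame independently (e.g. a fresh `LabelEncoder` per hospital) would instead assign integers based on whichever categories happen to appear locally, so the same value can map to different integers across hospitals.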
```python
Xh_tr, yh_tr, test_size=0.15, random_state=42, stratify=yh_tr)
ind_model = train_model(Xh_tr, yh_tr, Xh_va, yh_va,
                        feature_names=feature_cols)
m = evaluate_model(ind_model, Xh_te, yh_te,
```
The evaluation setup is not directly comparable:
- The consortium model is evaluated on both a global test split and full hospital datasets
- Individual models are evaluated on their own test splits
This makes the benchmark difficult to interpret. It would be better to evaluate all models on the same holdout data.
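A shared-holdout comparison could look roughly like this. It is a self-contained sketch with synthetic data and plain `LogisticRegression` stand-ins, not the PR's actual `train_model`/`evaluate_model` helpers:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hold out ONE shared test set, then score every model on it.
X, y = make_classification(n_samples=600, n_features=8, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Pretend the remaining pool is split between two hospitals.
half = len(X_pool) // 2
hospitals = [(X_pool[:half], y_pool[:half]), (X_pool[half:], y_pool[half:])]

consortium = LogisticRegression(max_iter=1000).fit(X_pool, y_pool)
scores = {"consortium": roc_auc_score(y_test, consortium.predict_proba(X_test)[:, 1])}
for i, (Xh, yh) in enumerate(hospitals):
    solo = LogisticRegression(max_iter=1000).fit(Xh, yh)
    scores[f"hospital_{i}"] = roc_auc_score(y_test, solo.predict_proba(X_test)[:, 1])

print(scores)
```

With every model scored on the same `X_test`/`y_test`, the AUC gap directly reflects model quality rather than differences in the evaluation data.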
```python
improvement = ((consortium_auc[0] - avg_solo) / max(abs(avg_solo), 0.01)) * 100
print(f"\n  Consortium AUC : {consortium_auc[0]:.4f}")
print(f"  Avg Solo AUC   : {avg_solo:.4f}")
print(f"  Improvement    : {improvement:+.1f}%")
```
The "improvement" metric can be negative depending on the results, which may contradict the narrative that the consortium model performs better.
It may be helpful to clarify this or avoid framing it as an improvement.
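One neutral framing is to report a signed relative difference instead of an "improvement". A tiny sketch with made-up AUC values (the variable names mirror the quoted code, the numbers are illustrative):

```python
# Neutral label: a negative value then reads as "consortium trails solo"
# rather than contradicting an "improvement" headline.
consortium_auc = 0.69  # illustrative value
avg_solo = 0.71        # illustrative value

rel = ((consortium_auc - avg_solo) / max(abs(avg_solo), 0.01)) * 100
print(f"Relative difference : {rel:+.1f}%")  # → Relative difference : -2.8%
```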