TrustCheckAI is an end-to-end platform for bias and compliance auditing, explainability, and model monitoring: it evaluates bias, mitigates discrimination, explains model decisions, and continuously monitors deployed machine learning systems with Prometheus and Grafana.

It provides real-time dashboards, fairness metrics, model explainability (LIME), drift detection, automated PDF reporting, and user feedback collection, all wrapped in a modern Streamlit UI and containerized for seamless deployment.
## Table of Contents

- Features
- Project Structure
- Technical Stack
- Supported Datasets
- System Architecture
- Compliance, Fairness & Security
- Human-Centered Design (HCI)
- Installation & Setup
- Usage Workflow
- Prometheus Metrics & Grafana Dashboards
- Drift Detection
- PDF Report Generation
- Demonstration
- Roadmap
- Citations
- Acknowledgements
## Features

**Fairness & Bias Auditing**
- Statistical Parity Difference
- Disparate Impact

**Explainability**
- LIME: local explanations per prediction

**Drift Detection**
- Kolmogorov–Smirnov (KS) test

**Model Training & Evaluation** (see the training sketch after this list)
- Logistic Regression
- Random Forest
- 5-fold cross-validation

**Monitoring & Alerting**
- Prometheus metric exporter
- Grafana dashboards
- Automated Slack alerts for accuracy/fairness drift

**Reporting**
- Full bias report
- Model performance summary

**User Interface**
- Clean, intuitive layout
- File upload, analysis, and visualization
- User feedback collection

**Deployment**
- Streamlit, Prometheus, and Grafana services
- Docker Compose orchestration
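To make the training features concrete, here is a minimal, hedged sketch of how the two supported models could be scored with 5-fold cross-validation in scikit-learn. The target column `two_year_recid` and the numeric-only preprocessing are illustrative assumptions; the real pipeline lives in `streamlit_app.py`.

```python
# Minimal sketch (not the app's exact code): Logistic Regression and Random
# Forest scored with 5-fold cross-validation on the bundled COMPAS CSV.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("compas-scores-two-years.csv")
y = df["two_year_recid"]  # assumed target column
# Keep numeric features only, drop target-related columns to avoid leakage.
X = (
    df.select_dtypes("number")
    .drop(columns=["two_year_recid", "is_recid"], errors="ignore")
    .fillna(0)
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy {scores.mean():.3f} ± {scores.std():.3f}")
```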
## Project Structure

```
TrustCheckAI/
├── .ipynb_checkpoints/          # Auto-generated Jupyter checkpoints
├── __pycache__/                 # Python bytecode cache
├── .DS_Store                    # macOS system metadata
├── Dockerfile                   # Docker build instructions
├── Final Report.pdf             # Final project report
├── README.md                    # Project documentation
├── TrustCheckAI-Demo.mp4        # Full application demo video
├── TrustCheckAI-demo.gif        # GIF preview for README
├── compas-scores-two-years.csv  # COMPAS dataset for fairness analysis
├── docker-compose.yml           # Multi-service orchestration (Streamlit + Prometheus + Grafana)
├── feedback.log                 # Logs for user feedback & events
├── prometheus.yml               # Prometheus scraping config
├── requirements.txt             # Python dependencies
└── streamlit_app.py             # Main Streamlit application
```
## Technical Stack

- Python 3.9+
- Scikit-learn
- AIF360
- LIME
- Prometheus
- Grafana
- Streamlit
- Docker
- Docker Compose
- GitHub
## Supported Datasets

TrustCheckAI ships with the COMPAS recidivism dataset (`compas-scores-two-years.csv`) and accepts user-uploaded CSV datasets. Each dataset must include at least one protected attribute, such as race, gender, or age, which is used for fairness auditing.
## System Architecture

The high-level architecture of TrustCheckAI is shown below:
```
+---------------------------+
|         User (UI)         |
|  • Upload CSV dataset     |
|  • Configure analysis     |
+-------------+-------------+
              |
              v
+-------------+-------------+
|   Streamlit Application   |
|  • Orchestration          |
|  • UX & controls          |
+-------------+-------------+
              |
        +-----+-------------------+----------------------------+
        |                         |                            |
        v                         v                            v
+----------------+    +------------------------+    +---------------------+
| Preprocessing  |    | Bias & Fairness        |    | Model Training &    |
| & Validation   |--->| Analysis (AIF360)      |--->| Evaluation (SKL)    |
| • Cleaning     |    | • Metrics & thresholds |    | • LR / RF           |
+----------------+    +------------------------+    +---------------------+
                                                               |
                                                               v
                                                +-----------------------------+
                                                |    Explainability (LIME)    |
                                                +-----------------------------+
                                                               |
                                                               v
                                                +-----------------------------+
                                                |    Drift Detection (KS)     |
                                                +-----------------------------+
                                                               |
                                                               v
                                              +----------------------------------+
                                              |   Prometheus Metrics Exporter    |
                                              | • upload_counter, accuracy_gauge |
                                              +----------------------------------+
                                                               |
                                                               v
                                              +----------------------------------+
                                              |   Grafana Dashboards & Alerts    |
                                              | • Accuracy / fairness panels     |
                                              | • Slack / email alerts           |
                                              +----------------------------------+
```

Component summary:
- **Streamlit App**: central controller for data upload, analysis steps, and visualization.
- **AIF360 Module**: computes fairness metrics and applies mitigation algorithms.
- **Model Training**: trains the ML models and logs their metrics.
- **XAI Module**: generates LIME explanations for transparency (see the sketch below).
- **Drift Detection**: monitors changes in data and predictions over time.
- **Prometheus & Grafana**: collect, visualize, and alert on key metrics.
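As a hedged illustration of the XAI module, the sketch below generates a LIME explanation for a single prediction. The model choice, train/test split, and class names are assumptions for the example, not the app's exact code.

```python
# Sketch: local explanation for one prediction with LIME (lime package).
import pandas as pd
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumed preprocessing: numeric COMPAS features, two_year_recid as target.
df = pd.read_csv("compas-scores-two-years.csv")
X = df.select_dtypes("number").drop(columns=["two_year_recid"]).fillna(0)
y = df["two_year_recid"]
X_train, X_test, y_train, _ = train_test_split(X, y, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

explainer = LimeTabularExplainer(
    training_data=X_train.to_numpy(),
    feature_names=list(X.columns),
    class_names=["no recidivism", "recidivism"],  # assumed label meanings for 0/1
    mode="classification",
)
exp = explainer.explain_instance(
    data_row=X_test.iloc[0].to_numpy(),
    predict_fn=model.predict_proba,
    num_features=5,  # top 5 feature contributions for this prediction
)
print(exp.as_list())
```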
### The Protected Attribute

In TrustCheckAI, the protected attribute is a sensitive feature such as race, gender, age, or ethnicity that represents groups we want to protect from unfair treatment.

Why it is important:

- **Fairness metrics are defined with respect to protected groups.** Measures like Statistical Parity Difference, Disparate Impact, and Equal Opportunity compare outcomes between protected and non-protected groups. Without a protected attribute, these metrics cannot be computed.
- **Bias detection requires group-wise comparison.** By conditioning on the protected attribute, TrustCheckAI can reveal whether the model treats one group systematically worse than another (e.g., lower approval rates or higher false-positive rates).
- **Used for auditing, not for discrimination.** In a responsible workflow, the protected attribute is often excluded from the model features used for prediction but retained in the evaluation pipeline so that fairness can be audited post hoc.
- **Regulatory and ethical compliance.** Many regulations (EEOC guidance, GDPR "special categories", anti-discrimination laws) explicitly refer to protected characteristics, so correctly identifying and handling the protected attribute is essential for demonstrating compliance.

TrustCheckAI makes the protected attribute explicit in the UI and in the generated reports so that stakeholders clearly understand which groups are being evaluated for fairness and how mitigation affects them. The sketch below shows how such group-wise metrics can be computed with AIF360.
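A minimal sketch of computing Statistical Parity Difference and Disparate Impact with AIF360 on a toy dataset; the column names `race` and `label` and the 0/1 group encodings are assumptions for illustration.

```python
# Sketch: dataset-level fairness metrics with AIF360, using an assumed binary
# protected attribute "race" (1 = privileged, 0 = unprivileged) and a binary
# label "label" (1 = favorable outcome).
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

df = pd.DataFrame({
    "race":  [1, 1, 1, 0, 0, 0],
    "label": [1, 1, 0, 1, 0, 0],
})

dataset = BinaryLabelDataset(
    df=df,
    label_names=["label"],
    protected_attribute_names=["race"],
    favorable_label=1,
    unfavorable_label=0,
)

metric = BinaryLabelDatasetMetric(
    dataset,
    privileged_groups=[{"race": 1}],
    unprivileged_groups=[{"race": 0}],
)

# Statistical Parity Difference: P(favorable | unprivileged) - P(favorable | privileged)
print("SPD:", metric.statistical_parity_difference())
# Disparate Impact: P(favorable | unprivileged) / P(favorable | privileged)
# (values below 0.8 are commonly flagged under the four-fifths rule)
print("DI:", metric.disparate_impact())
```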
## Compliance, Fairness & Security

- Regulatory alignment (EEOC, justice and fairness guidelines)
- Differential privacy
- Ethical AI lifecycle tracking
- Secure, isolated containers
## Human-Centered Design (HCI)

- Accessible charts
- Colorblind-safe design
- Clear fairness/performance separation
- Prototyped user flows
## Installation & Setup

```bash
git clone https://github.com/27HarshalPatel/TrustCheckAI.git
cd TrustCheckAI
docker-compose up --build
```

Access:

- Streamlit → http://localhost:8501
- Prometheus → http://localhost:9090
- Grafana → http://localhost:3000
## Usage Workflow

1. Upload a CSV dataset.
2. Select the protected attribute.
3. Select the target variable.
4. Run "Analyze Dataset".
5. Review the bias and compliance check results alongside the model's predictive accuracy.
6. View the LIME analyses.
7. Generate the PDF report.
8. Monitor metrics in Grafana.
9. Receive alerts if accuracy falls below 70%.
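A hedged skeleton of the first four steps of this workflow in Streamlit; the widget labels mirror the steps above, but the actual implementation in `streamlit_app.py` may differ.

```python
# Sketch of the upload → select → analyze flow (not the app's exact code).
import pandas as pd
import streamlit as st

st.title("TrustCheckAI")

uploaded = st.file_uploader("Upload CSV", type="csv")            # step 1
if uploaded is not None:
    df = pd.read_csv(uploaded)
    protected = st.selectbox("Protected attribute", df.columns)  # step 2
    target = st.selectbox("Target variable", df.columns)         # step 3
    if st.button("Analyze Dataset"):                             # step 4
        st.write(f"Auditing fairness of `{target}` with respect to `{protected}`...")
        # ... fairness metrics, model training, LIME, report generation ...
```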
## Prometheus Metrics & Grafana Dashboards

Exported metrics:

- `upload_counter`
- `analysis_counter`
- `accuracy_gauge`
- `feedback_ratings_counter`
- `feedback_comments_counter`
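A minimal sketch, using `prometheus_client`, of how these metrics could be defined and exposed for Prometheus to scrape; the port and help strings are assumptions, and `prometheus.yml` must point a scrape job at whatever endpoint the app actually exposes.

```python
# Sketch: defining and exposing the listed metrics with prometheus_client.
from prometheus_client import Counter, Gauge, start_http_server

upload_counter = Counter("upload_counter", "Number of datasets uploaded")
analysis_counter = Counter("analysis_counter", "Number of analyses run")
accuracy_gauge = Gauge("accuracy_gauge", "Latest model accuracy")
feedback_ratings_counter = Counter("feedback_ratings_counter", "User ratings submitted")
feedback_comments_counter = Counter("feedback_comments_counter", "User comments submitted")

# Serve /metrics on :8000 for Prometheus to scrape.
# Note: prometheus_client exposes Counters with a "_total" suffix.
start_http_server(8000)

# Example updates from inside the app:
upload_counter.inc()       # a CSV was uploaded
analysis_counter.inc()     # an analysis finished
accuracy_gauge.set(0.83)   # record the latest accuracy
```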
## Drift Detection

TrustCheckAI applies the two-sample Kolmogorov–Smirnov (KS) test to flag distribution shifts between the data a model was trained on and newly arriving data.
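A minimal sketch of a two-sample KS drift check using SciPy; the synthetic data and the 0.05 significance threshold are assumptions.

```python
# Sketch: Kolmogorov–Smirnov drift check between a reference (training-time)
# feature distribution and newly observed values.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=1000)  # training-time feature values
incoming = rng.normal(loc=0.4, scale=1.0, size=1000)   # new data with a shifted mean

stat, p_value = ks_2samp(reference, incoming)
if p_value < 0.05:
    print(f"Drift detected (KS statistic={stat:.3f}, p={p_value:.4f})")
else:
    print("No significant drift")
```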
## PDF Report Generation

The generated PDF report includes fairness metrics and a model performance summary.
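As a hedged sketch, here is how such a report could be assembled with `fpdf2`; the source does not state which PDF library TrustCheckAI actually uses, and the metric values are placeholders.

```python
# Sketch: assembling a simple PDF report with fpdf2 (assumed library).
from fpdf import FPDF

pdf = FPDF()
pdf.add_page()
pdf.set_font("Helvetica", style="B", size=16)
pdf.cell(0, 10, "TrustCheckAI Bias & Compliance Report", new_x="LMARGIN", new_y="NEXT")

pdf.set_font("Helvetica", size=12)
for line in [
    "Statistical Parity Difference: -0.12",   # placeholder values
    "Disparate Impact: 0.81",
    "Model accuracy (5-fold CV): 0.83",
]:
    pdf.cell(0, 8, line, new_x="LMARGIN", new_y="NEXT")

pdf.output("trustcheckai_report.pdf")
```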
## Roadmap

- Fairlearn integration
- Kubernetes deployment
- Extended fairness metrics
## Citations

- IBM AIF360
- COMPAS dataset (ProPublica, `compas-scores-two-years.csv`)

## Acknowledgements

- University of Florida
- HiPerGator Computing
- Open-source community
