Preliminary screening and triage support utility. Smart Eye does NOT provide a clinical diagnosis. Final medical conclusions remain entirely the responsibility of qualified healthcare professionals.
- Project Overview
- Key Features
- System Architecture
- Technology Stack
- Repository Structure
- Setup & Installation
- Running the Application
- Model Training
- API Reference
- Test Suite
- Design Decisions & Limitations
- Performance Benchmarks
Smart Eye is a full-stack, privacy-first ocular health screening platform built as a university capstone project. It combines three independent AI/computational pipelines into a single composite risk score called the Ocular Health Index (OHI):
| Pipeline | Technology | Purpose |
|---|---|---|
| Fundus disease screening | ResNet-50 + VGG-16 hybrid CNN | Classify retinal images into 4 disease classes |
| Anterior-segment screening | EfficientNetB0 | Classify corneal/anterior images into 4 classes |
| Real-time fatigue monitoring | dlib 68-point landmarks + Eye Aspect Ratio (EAR) | Detect drowsiness and blink rate from webcam |
These three signals are fused by a Mamdani fuzzy inference system (pure NumPy, no external fuzzy library) to produce the OHI. A deterministic recommendation engine then maps the OHI band to plain-language clinical guidance, capped at two actions to prevent alert fatigue.
The platform is accessible via a React/Vite frontend (port 5174) backed by a FastAPI REST API (port 8000), with SQLite session persistence and optional Google OAuth authentication.
- Dual vision workflows — fundus (posterior-segment) and anterior-segment classifiers run as independent, parallel pipelines sharing the same orchestration, fuzzy engine, persistence, and PDF export layer.
- Grad-CAM explainability — both CNNs produce a jet-coloured heatmap overlay showing which image regions drove the prediction, returned as a base64 PNG data URI alongside the prediction JSON.
- Mamdani fuzzy risk fusion — confidence, fatigue score, and symptom intensity are fused using triangular/trapezoidal membership functions, min-implication, max-aggregation, and centroid defuzzification — all implemented transparently in NumPy.
- Real-time fatigue monitor — EAR computed from 68 facial landmarks at ~8 Hz; blink detection uses a 3-frame smoothed EAR with adaptive threshold; drowsiness triggers after 2.5 s of sustained closure.
- Privacy-first design — raw video never leaves the device; only landmark coordinates are sent to the API.
- Mock-model fallback — if no trained weights are present, a clearly-labelled deterministic mock model activates. The UI shows a loud "MOCK MODEL" banner so screening output can never be mistaken for a real result.
- Stratified K-Fold training — the anterior trainer uses 5-fold cross-validation with per-fold balanced class weights (scikit-learn), giving an honest mean validation accuracy across imbalanced clinical data.
- Session history & PDF export — signed-in users get persistent session history and downloadable PDF reports; guests see only their live result.
- Bilingual UI — full English/Bengali (বাংলা) internationalisation via a custom i18n hook.
┌─────────────────────────────────────────────────────────┐
│ React / Vite Frontend │
│ localhost:5174 (port 5174) │
│ Home · Screening · Fatigue · History · Analytics │
└──────────────────────┬──────────────────────────────────┘
│ HTTP / REST (Vite proxy → :8000)
┌──────────────────────▼──────────────────────────────────┐
│ FastAPI Orchestration Layer │
│ localhost:8000 │
│ /api/* (fundus) │ /api/anterior/* (anterior) │
└──────┬───────────────┴──────────┬──────────────────────┘
│ │
┌──────▼──────────┐ ┌──────────▼──────────┐
│ Fundus Domain │ │ Anterior Domain │
│ ResNet50+VGG16 │ │ EfficientNetB0 │
│ Grad-CAM │ │ Grad-CAM │
└──────┬──────────┘ └──────────┬──────────┘
│ │
└──────────┬───────────────┘
│ DiseasePrediction (shared schema)
┌──────────▼────────────────┐
│ ScreeningOrchestrator │
│ MamdaniFuzzyEngine │ ← pure NumPy
│ RecommendationEngine │
└──────────┬────────────────┘
│ SessionSummary
┌──────────▼────────────────┐
│ SQLite Persistence │
│ PDF Report Generator │
│ Auth (email + Google) │
└───────────────────────────┘
- User uploads a fundus or anterior image and fills in a symptom questionnaire (1–5 Likert scale).
- Simultaneously, the fatigue monitor accumulates per-frame EAR values from the webcam.
- The API decodes the image, runs
model.predict(image)→DiseasePrediction. - The orchestrator calls
fuzzy.infer(confidence, fatigue_score, symptom_aggregate)→OHIResult. - The recommendation engine maps
(ohi_band, top_class, fatigue)→Recommendation(≤ 2 actions). - A
SessionSummaryis returned to the frontend and optionally persisted for signed-in users.
| Layer | Technology | Version |
|---|---|---|
| Language | Python | 3.9.6 |
| Deep learning | TensorFlow / Keras | 2.13.0 / 2.13.1 |
| API framework | FastAPI | 0.100.0 |
| Data validation | Pydantic | 1.10.26 (v1) |
| ASGI server | Uvicorn | 0.29.0 |
| Numerical compute | NumPy | 1.24.3 |
| Image processing | Pillow | 11.3.0 |
| K-fold / class weights | scikit-learn | ≥ 1.3, < 1.4 |
| HTTP client (tests) | httpx | 0.28.1 |
| Frontend | React + Vite | 18 / 4 |
| Persistence | SQLite (stdlib) | — |
| Auth | email/password + Google OAuth | — |
smart_eye_project 2/
├── smart_eye/ # Python package (backend)
│ ├── config.py # All constants: classes, input sizes, budgets, paths
│ ├── schemas.py # Pydantic data contracts (shared across layers)
│ ├── auth.py # Email/password + Google OAuth
│ ├── domain/
│ │ ├── disease_screening.py # Fundus CNN: ResNet50 + VGG16 hybrid
│ │ ├── anterior_screening.py # Anterior CNN: EfficientNetB0
│ │ ├── gradcam.py # Grad-CAM for the fundus hybrid
│ │ ├── anterior_gradcam.py # Grad-CAM for the anterior EfficientNetB0
│ │ ├── fuzzy_risk_engine.py # Mamdani fuzzy inference (pure NumPy)
│ │ ├── fatigue_monitor.py # EAR + blink/drowsiness state machine
│ │ ├── recommendations.py # Deterministic recommendation engine
│ │ └── report_generator.py # PDF session report
│ ├── orchestration/
│ │ ├── api.py # FastAPI app (fundus routes)
│ │ ├── anterior_api.py # FastAPI router (/api/anterior/* routes)
│ │ └── orchestrator.py # ScreeningOrchestrator (model-agnostic)
│ ├── persistence/
│ │ └── store.py # SQLite session CRUD
│ ├── models/ # Trained weights (not committed to git)
│ │ ├── hybrid_cnn_native.h5 # Fundus model weights
│ │ └── anterior_efficientnet.h5 # Anterior model weights
│ └── tests/
│ ├── test_core.py # Unit + integration tests (T01–T07)
│ ├── test_performance.py # Latency budget tests (T08)
│ └── test_anterior.py # Anterior workflow tests (TA1–TA7)
├── frontend/ # React / Vite SPA
│ └── src/
│ ├── pages/ # Home, Screening, Fatigue, History, Analytics
│ └── components.jsx # Shared UI components
├── dataset/ # Fundus training images (not committed)
│ ├── train/
│ └── validation/
├── dataset_anterior/ # Anterior training images (not committed)
│ ├── train/
│ └── validation/
├── train_hybrid.py # Fundus model training script
├── train_anterior.py # Anterior model training script (K-fold)
├── requirements.txt # Python dependencies
└── README.md # This file
- Python 3.9.x (system or user install — no virtual environment required, though one is recommended)
- Node.js ≥ 18 and npm
- macOS / Linux (tested on macOS arm64)
git clone https://github.com/emon22-ts/smart-eye.git
cd "smart_eye_project 2"pip install fastapi==0.100.0 uvicorn==0.29.0 python-multipart==0.0.20 \
pillow numpy tensorflow==2.13.0 pydantic==1.10.26 httpx \
"scikit-learn>=1.3,<1.4"Or from the requirements file:
pip install -r requirements.txtcd frontend
npm install
cd ..The model weights are stored in Git LFS and are not bundled in the repository zip. After cloning, pull them with:
git lfs pullThis downloads smart_eye/models/hybrid_cnn_native.h5 (fundus) and smart_eye/models/anterior_efficientnet.h5 (anterior) into place. If LFS is unavailable, train from scratch using the scripts in Section 8.
Open two terminals from the project root.
Terminal 1 — Backend API (port 8000):
uvicorn smart_eye.orchestration.api:app --reloadOn startup you will see either:
Loaded REAL native model from .../hybrid_cnn_native.h5— trained model active.No native model found; using MOCK MODEL.— mock fallback active; the UI will show a warning banner.
Terminal 2 — Frontend dev server (port 5174):
cd frontend
npm run devOpen http://localhost:5174 in your browser.
The Vite dev server proxies all /api requests to http://localhost:8000, so no CORS configuration is needed during development.
API-only demo page (no frontend required):
http://localhost:8000
Dataset: Guna Venkat Doddi "Eye Diseases Classification" — 4 classes: Normal, Cataract, Glaucoma, Diabetic_Retinopathy.
Arrange images as:
dataset/
train/
Normal/ Cataract/ Glaucoma/ Diabetic_Retinopathy/
validation/
Normal/ Cataract/ Glaucoma/ Diabetic_Retinopathy/
Then run:
python train_hybrid.pyThis trains for up to 25 epochs with early stopping (patience 6), saves the best checkpoint to smart_eye/models/hybrid_cnn.h5, and prints a per-class classification report if scikit-learn is available. Achieved validation accuracy: 88.3%.
Classes: Normal, Cataract, Keratitis, Corneal_Scar.
Arrange images as:
dataset_anterior/
train/
Normal/ Cataract/ Keratitis/ Corneal_Scar/
validation/ ← optional held-out check
Normal/ Cataract/ Keratitis/ Corneal_Scar/
Then run:
pip install "scikit-learn>=1.3,<1.4" # required for K-fold + class weights
python train_anterior.pyThis performs 5-fold stratified cross-validation with per-fold balanced class weights, an optional backbone fine-tuning phase (unfreezing all layers except BatchNorm at 1e-5 LR), and saves the best model across all folds to smart_eye/models/anterior_efficientnet.h5.
Key design differences from the fundus trainer:
| Fundus | Anterior | |
|---|---|---|
| Architecture | ResNet-50 + VGG-16 (late fusion) | EfficientNetB0 (single branch) |
| Pixel scaling | rescale=1./255 |
None — EfficientNet self-normalises in [0, 255] |
| Validation strategy | Single train/val split | Stratified 5-fold CV |
| Class imbalance | None applied | Balanced class weights per fold |
| Grad-CAM target layer | conv5_block3_out (ResNet branch) |
top_activation (top-level) |
Important: The
[0, 255]pixel range in the anterior pipeline is intentional and load-bearing. Do not add a/255rescale — accuracy will silently collapse at inference time.
All endpoints are served at http://localhost:8000. Interactive docs are available at http://localhost:8000/docs (Swagger UI).
| Method | Endpoint | Description |
|---|---|---|
GET |
/api/health |
Server status, active model ID, budget constants |
POST |
/api/screen/image |
Classify a fundus image → DiseasePrediction |
POST |
/api/screen/image/explain |
Classify + return Grad-CAM overlay (base64 PNG) |
POST |
/api/fatigue/frame |
Advance fatigue monitor with 68 facial landmarks |
POST |
/api/session/score |
Full fusion: image + symptoms + fatigue → SessionSummary |
GET |
/api/sessions |
List session history (signed-in users only) |
GET |
/api/sessions/{id} |
Retrieve one session |
DELETE |
/api/sessions/{id} |
Delete one session |
GET |
/api/sessions/{id}/pdf |
Download PDF report |
| Method | Endpoint | Description |
|---|---|---|
GET |
/api/anterior/health |
Anterior model status and class list |
POST |
/api/anterior/screen/image |
Classify an anterior image → DiseasePrediction |
POST |
/api/anterior/screen/image/explain |
Classify + Grad-CAM overlay |
POST |
/api/anterior/session/score |
Full fusion (anterior model, shared fuzzy/recommender) |
DiseasePrediction
{
"probabilities": { "Normal": 0.87, "Cataract": 0.08, "Glaucoma": 0.03, "Diabetic_Retinopathy": 0.02 },
"top_class": "Normal",
"top_confidence": 0.87,
"is_mock": false,
"model_id": "hybrid-cnn-native"
}SessionSummary (from /api/session/score)
{
"disease": { "...DiseasePrediction..." },
"fatigue": { "ear": 0.31, "blink_rate_bpm": 16.2, "drowsy": false, "fatigue_score": 12.0, "face_detected": true },
"symptoms_aggregate": 1.5,
"ohi": { "ohi": 84.0, "risk_index": 16.0, "band": "Low", "colour": "green", "rule_activations": ["..."] },
"recommendation": { "actions": ["No urgent action indicated. Continue routine eye care."], "urgency": "routine", "referral_flag": false, "disclaimer": "..." },
"latency_ms": 78.4
}Endpoints that persist or retrieve session data require an Authorization header:
Authorization: Bearer <token>
Guest users (no header) receive live results but no saved history.
Tests are written with pytest and live in smart_eye/tests/. Run all tests from the project root:
pytest -vRun only the anterior tests:
pytest smart_eye/tests/test_anterior.py -v| ID | File | What it tests |
|---|---|---|
| T01 | test_core.py |
Mock disease model: all classes, sums to 1, deterministic, is_mock honest |
| T02 | test_core.py |
EAR geometry: known coordinates, closed-eye threshold, degenerate horizontal |
| T03 | test_core.py |
Fuzzy engine: boundary OHI in [0, 100], healthy inputs → Low band, severe → High band |
| T04 | test_core.py |
Orchestrator end-to-end: SessionSummary produced with correct band and mock flag |
| T05 | test_core.py |
Drowsiness state machine: triggers after 2.5 s, face-lost handled, blink counted |
| T06 | test_core.py |
Recommendations: capped at 2 actions, disclaimer always attached |
| T07 | test_core.py |
Preprocessing: grayscale/RGBA coercion, output shape (1, 224, 224, 3) |
| T08 | test_performance.py |
Inference latency < 2.0 s; frame processing < 50 ms (skips if mock) |
| TA1 | test_anterior.py |
Anterior mock: 4 classes, sums to 1, deterministic, is_mock honest |
| TA2 | test_anterior.py |
Anterior preprocessing: [0, 255] range kept (not divided by 255) |
| TA3 | test_anterior.py |
Anterior inference latency within budget (mock + real model paths) |
| TA4 | test_anterior.py |
Extreme fuzzy inputs produce finite OHI; degenerate images don't crash |
| TA5 | test_anterior.py |
Grad-CAM: finite, normalised heatmap; epsilon guard prevents NaN |
| TA6 | test_anterior.py |
Anterior model flows through the SHARED orchestrator + fuzzy engine |
| TA7 | test_anterior.py |
HTTP round-trip against /api/anterior/screen/image via httpx ASGI transport |
TA5 and TA7 require
tensorflowandhttpxrespectively; theypytest.skipcleanly if those packages are absent.
The fundus (posterior-segment) and anterior-segment taxonomies use different training datasets, different backbones, and — critically — different pixel scaling conventions. Merging them into one model would silently break one of the two inference paths. Running them as independent routers sharing the same orchestrator and fuzzy engine gives full isolation with no code duplication in the business logic.
scikit-fuzzy's control API is well-known but is a black box from an assessment perspective. Implementing Mamdani inference manually in NumPy makes every membership function, rule firing, implication, aggregation, and defuzzification step directly inspectable and testable — which is important for a project that is assessed for technical depth.
The anterior dataset is smaller and potentially more imbalanced than the fundus dataset. A single train/val split risks a lucky or unlucky partition; 5-fold stratified CV provides an honest mean ± std accuracy that is more defensible in an assessment context.
- The
is_mockflag must always be checked before displaying predictions. The mock model is deterministic and hash-seeded but carries no clinical meaning. - Grad-CAM is an approximation for the fundus hybrid (single-branch explanation of a fused model); it is exact for the single-branch EfficientNetB0.
- The platform is a triage support tool, not a diagnostic device. It has not been validated in a clinical setting.
- Real-time fatigue monitoring requires a webcam and a browser that supports the MediaDevices API.
Measured on Apple Silicon (M-series), macOS arm64, Python 3.9.6, TensorFlow 2.13.
| Metric | Value | Budget |
|---|---|---|
| Fundus CNN validation accuracy | 88.3% | ≥ 85% |
| Anterior CNN mean val accuracy (5-fold) | ≥ 88% (data-dependent) | ≥ 88% |
| Single-image inference latency | ~80 ms | < 2000 ms |
| Fatigue frame processing latency | < 1 ms | < 50 ms |
| Landmark cadence (webcam) | ~8 Hz | — |
| Fuzzy inference (NumPy centroid) | < 5 ms | — |
Smart Eye is a preliminary screening and triage support utility. It does NOT provide a clinical diagnosis. Final medical conclusions remain entirely the responsibility of qualified healthcare professionals.
- Dataset: Eye Diseases Classification — Guna Venkat Doddi
- Source: https://www.kaggle.com/datasets/gunavenkatdoddi/eye-diseases-classification
- Classes: Normal, Cataract, Glaucoma, Diabetic_Retinopathy (~1,000 images per class, 4,217 total)
- Place images in:
dataset/train/<class>/anddataset/validation/<class>/
- Dataset: Assembled from publicly available slit-lamp and anterior segment image sources
- Classes: Normal, Cataract, Keratitis, Corneal_Scar
- Place images in:
dataset_anterior/train/<class>/anddataset_anterior/validation/<class>/
Note: Datasets are not included in this repository due to size. Download them separately and place them in the correct folders before running the training scripts.