A Hybrid CNN–LLM Pipeline for Chest X-Ray Classification and Radiology Report Generation
Academic Project — Developed as part of the AI for Healthcare course in the Master of Science in Artificial Intelligence (MSAI) program at the University of Texas at Austin.
- Screenshots
- Overview
- Research Question
- Pipeline
- Target Classes
- Tech Stack
- Dataset
- Getting Started
- Usage
- API Endpoints
- Project Structure
- Evaluation & Charts
- Research Paper
- License
- Acknowledgments
ChestXpert is a hybrid AI tool that combines a locally fine-tuned CNN (EfficientNetB0) with GPT-5.4 (Azure OpenAI) to classify chest X-ray images and generate radiology-style reports. The system runs a three-stage pipeline:
- Stage A — Local CNN: Classifies the X-ray with confidence scores and produces a Grad-CAM explainability heatmap.
- Stage B — GPT-5.4 Blind Analysis: Sends only the raw image to GPT-5.4 for an independent classification and radiology findings — no CNN results are shared (blind evaluation).
- Stage C — Merged Conclusion: GPT-5.4 receives both analyses and the Grad-CAM heatmap, then synthesizes a final radiology-style report with agreement analysis.
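The Grad-CAM step in Stage A can be illustrated with a small, framework-free sketch. It assumes you already have the last convolutional layer's activations and the gradients of the predicted class score with respect to them; in the project these would come from TensorFlow. The function name `grad_cam` and the nested-list layout are hypothetical and are not taken from backend/cnn/gradcam.py.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM map.

    activations, gradients: nested lists shaped [channels][height][width].
    Returns a [height][width] list with values in [0, 1].
    """
    channels = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weight = global average of that channel's gradients.
    weights = [
        sum(sum(row) for row in gradients[k]) / (height * width)
        for k in range(channels)
    ]

    # Weighted sum of the activation maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j] for k in range(channels)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be overlaid on the X-ray.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam
```

The low-resolution map would then be resized to the input image size and blended with the X-ray to produce the overlay.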
Do CNN attention regions and LLM visual analysis converge on clinically relevant features in chest radiographs, and does combining a fine-tuned specialist model with a generalist LLM improve diagnostic confidence?
User uploads X-ray image
│
▼
┌─────────────────────────────────────────────┐
│ STAGE A — Local CNN (EfficientNetB0) │
│ • Predicted class + confidence scores │
│ • Grad-CAM heatmap │
└─────────────────┬───────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ STAGE B — GPT-5.4 Blind Analysis │
│ • Raw image only (no CNN results) │
│ • Classification + radiology findings │
└─────────────────┬───────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ STAGE C — GPT-5.4 Merged Conclusion │
│ • Synthesizes CNN + GPT blind analyses │
│ • Agreement/disagreement assessment │
│ • Final radiology-style report │
└─────────────────────────────────────────────┘
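The flow in the diagram above can be sketched as a small orchestration function. This is an illustration under assumptions, not the contents of backend/pipeline/orchestrator.py: `StageResult`, `run_pipeline`, and the three stage callables are hypothetical names, and the real stages would call the CNN and the Azure OpenAI client.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StageResult:
    label: str                 # predicted class name
    details: Dict[str, object]  # e.g. confidence scores, heatmap, findings


def run_pipeline(
    image: bytes,
    cnn_stage: Callable[[bytes], StageResult],
    blind_stage: Callable[[bytes], StageResult],
    merge_stage: Callable[[bytes, StageResult, StageResult], str],
) -> Dict[str, object]:
    # Stage A: local CNN prediction (Grad-CAM would travel in cnn.details).
    cnn = cnn_stage(image)
    # Stage B: the LLM sees only the raw image, never the CNN output.
    blind = blind_stage(image)
    # Stage C: the LLM receives both analyses and writes the final report.
    report = merge_stage(image, cnn, blind)
    return {
        "cnn": cnn,
        "blind": blind,
        "agreement": cnn.label == blind.label,
        "report": report,
    }
```

Passing the stages in as callables keeps Stage B blind by construction: it has no access to the Stage A result, and each stage can be replaced by a stub in tests.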
| Class | Description |
|---|---|
| COVID-19 | COVID-19 positive chest X-rays |
| Normal | Healthy lung radiographs |
| Lung Opacity | Non-COVID lung opacities |
| Viral Pneumonia | Non-COVID viral pneumonia |
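As a rough illustration of how a four-way classifier output maps onto these classes, the sketch below turns raw scores into confidence scores and picks the top class. The class order shown is an assumption (alphabetical by dataset folder name); the order used by the trained model should be checked before relying on it. `softmax` and `top_prediction` are hypothetical helper names.

```python
import math

# Assumed order: alphabetical by dataset folder name. Verify against the
# class indices your trained model actually uses.
CLASS_NAMES = ["COVID-19", "Lung Opacity", "Normal", "Viral Pneumonia"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(probabilities):
    """Return (class name, confidence) for the most likely class."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASS_NAMES[index], probabilities[index]
```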
| Layer | Technology |
|---|---|
| Frontend | React 19 · TypeScript · Vite · Tailwind CSS |
| Backend | Python · FastAPI · Uvicorn |
| CNN Model | TensorFlow / Keras · EfficientNetB0 (transfer learning) |
| Explainability | Grad-CAM |
| LLM | GPT-5.4 via Azure OpenAI (vision endpoint) |
| Auth | Azure DefaultAzureCredential (no API keys in config) |
This project uses the COVID-19 Radiography Database from Kaggle:
🔗 https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database
- 21,165 PNG images (299×299) across 4 classes
- The dataset is not included in this repository. Download it from the link above and extract it locally.
Expected folder structure after extraction:
COVID_Dataset/
├── COVID/images/
├── Lung_Opacity/images/
├── Normal/images/
└── Viral Pneumonia/images/ ← note the space in the folder name
Place the dataset in the project root as ./COVID_Dataset (default), or set a custom path via the DATASET_PATH variable in your .env file.
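Before training, it can help to confirm that the extracted dataset matches the layout above. The following is a minimal sketch: `count_images` is a hypothetical helper, and reading DATASET_PATH straight from the process environment is an assumption (the project loads it through its own settings module).

```python
import os
from pathlib import Path

# Folder names as they appear after extraction (note the space in the last one).
CLASS_FOLDERS = ["COVID", "Lung_Opacity", "Normal", "Viral Pneumonia"]


def count_images(dataset_root=None):
    """Return {class folder: number of PNG files}, or raise if a folder is missing."""
    root = Path(dataset_root or os.environ.get("DATASET_PATH", "./COVID_Dataset"))
    counts = {}
    for folder in CLASS_FOLDERS:
        image_dir = root / folder / "images"
        if not image_dir.is_dir():
            raise FileNotFoundError(f"Missing expected folder: {image_dir}")
        counts[folder] = sum(1 for _ in image_dir.glob("*.png"))
    return counts
```

Calling `count_images()` with no argument checks DATASET_PATH if it is set in the environment and ./COVID_Dataset otherwise.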
- Python 3.10+
- Node.js 18+
- Azure OpenAI access with a GPT-5.4 (vision-capable) deployment
- Azure CLI, logged in via az login (used for authentication)
git clone https://github.com/<your-username>/ChestXpert-App.git
cd ChestXpert-App
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # macOS / Linux
# venv\Scripts\activate # Windows
# Install dependencies
pip install -r backend/requirements.txt
cp .env.example .env
Edit .env and fill in your Azure OpenAI values:
AZURE_OPENAI_BASE_URL=https://your-resource.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=gpt-5.4
AZURE_OPENAI_API_VERSION=2024-12-01-preview
Note: Authentication uses DefaultAzureCredential, so make sure you're logged in with az login. No API key is needed in the .env file.
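To show how keyless authentication can fit together with these settings, here is a sketch that validates the three variables and builds a client from them. It is not the project's backend/llm/client.py: `load_azure_settings` and `build_client` are hypothetical names, and the sketch assumes the `azure-identity` and `openai` packages are installed.

```python
import os

REQUIRED_VARS = (
    "AZURE_OPENAI_BASE_URL",
    "AZURE_OPENAI_DEPLOYMENT",
    "AZURE_OPENAI_API_VERSION",
)


def load_azure_settings(env=None):
    """Read the required settings and fail early if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def build_client(settings):
    """Create an Azure OpenAI client that authenticates via az login."""
    # Imported lazily so the settings check works without the Azure SDKs.
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from openai import AzureOpenAI

    token_provider = get_bearer_token_provider(
        DefaultAzureCredential(),
        "https://cognitiveservices.azure.com/.default",
    )
    return AzureOpenAI(
        azure_endpoint=settings["AZURE_OPENAI_BASE_URL"],
        api_version=settings["AZURE_OPENAI_API_VERSION"],
        azure_ad_token_provider=token_provider,
    )
```

The deployment name (AZURE_OPENAI_DEPLOYMENT) is not part of the client itself; it would be passed as the `model` argument on each request.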
Download the COVID-19 Radiography Database and extract it to ./COVID_Dataset (or update DATASET_PATH in .env).
Run from the project root:
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
The API will be available at http://localhost:8000, with interactive docs at http://localhost:8000/docs.
cd frontend
npm install
npm run dev
The app will be available at http://localhost:5173.
The application has three main tabs:
- Enter the dataset path (default: ./COVID_Dataset)
- Adjust hyperparameters if desired (epochs, batch size, learning rate)
- Click Start Training and monitor live progress
- Training curves appear after training completes
- Upload a chest X-ray image (PNG or JPEG)
- Click Analyze
- Results appear in three panels:
  - CNN Results: predicted class, confidence scores, Grad-CAM heatmap
  - GPT-5.4 Blind: classification and radiology findings (no CNN input)
  - Merged Conclusion: synthesized final report
- Set the GPT sample size (default: 200, stratified 50 per class)
- Click Run Evaluation
- View confusion matrices, per-class metrics, CNN vs GPT comparison charts, ROC curves, and the Grad-CAM grid
- Export all charts as a ZIP for paper figures
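The stratified GPT sample mentioned above (200 images, 50 per class) could be drawn along these lines. This is a sketch under assumptions: `stratified_sample` is a hypothetical helper, and the project's own sampling in backend/evaluation/batch_gpt.py may differ.

```python
import random
from collections import defaultdict


def stratified_sample(items, per_class=50, seed=0):
    """Pick up to per_class items for each label.

    items: iterable of (path, label) pairs.
    Returns a list of (path, label) pairs, grouped by label.
    """
    by_label = defaultdict(list)
    for path, label in items:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = []
    for label in sorted(by_label):
        paths = by_label[label]
        chosen = rng.sample(paths, min(per_class, len(paths)))
        sample.extend((path, label) for path in chosen)
    return sample
```

Sampling the same number of images per class keeps the smaller classes from being swamped by the larger ones in the GPT metrics, at the cost of not reflecting the dataset's natural class balance.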
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/train | Start model training |
| GET | /api/train/status | Poll training progress |
| POST | /api/predict | Upload image and run full pipeline |
| GET | /api/model/info | Current model metadata |
| POST | /api/evaluate | Run full test-set evaluation |
| GET | /api/evaluate/status | Poll evaluation progress |
| GET | /api/evaluate/results | Fetch latest evaluation results and charts |
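The two status endpoints are meant to be polled while a long-running job is in progress. A minimal client-side sketch using only the standard library follows. The response schema is not documented here, so the caller supplies the `is_done` check; `get_json` and `poll` are hypothetical helpers.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"


def get_json(path):
    """GET a path on the backend and decode the JSON body."""
    with urllib.request.urlopen(BASE_URL + path) as response:
        return json.load(response)


def poll(path, is_done, fetch=get_json, interval=2.0, timeout=3600.0):
    """Call fetch(path) repeatedly until is_done(status) is true."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch(path)
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{path} did not finish within {timeout} s")
        time.sleep(interval)
```

For example, `poll("/api/train/status", lambda s: s.get("state") == "done")` would wait for training to finish, assuming the status payload has a `state` field. That field name is a guess; check the interactive docs at /docs for the actual schema.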
ChestXpert-App/
├── backend/
│ ├── main.py # FastAPI entry point
│ ├── config.py # pydantic-settings configuration
│ ├── requirements.txt # Python dependencies
│ ├── cnn/
│ │ ├── model.py # EfficientNetB0 architecture
│ │ ├── dataset.py # Data loading, augmentation, splits
│ │ ├── train.py # Training loop with callbacks
│ │ ├── predict.py # Single-image inference
│ │ └── gradcam.py # Grad-CAM heatmap generation
│ ├── llm/
│ │ ├── client.py # Azure OpenAI GPT-5.4 client
│ │ ├── prompts.py # Stage B & C prompt templates
│ │ └── parser.py # GPT response parsing
│ ├── pipeline/
│ │ ├── orchestrator.py # Stage A → B → C orchestration
│ │ └── schemas.py # Pydantic I/O models
│ ├── evaluation/
│ │ ├── evaluate.py # CNN test-set metrics
│ │ ├── batch_gpt.py # GPT-5.4 batch evaluation
│ │ ├── compare.py # CNN vs GPT comparison
│ │ └── charts.py # Matplotlib/seaborn chart generation
│ ├── routers/ # FastAPI route handlers
│ └── models/ # Saved model weights (gitignored)
├── frontend/
│ ├── src/
│ │ ├── App.tsx
│ │ ├── components/ # React UI components
│ │ ├── api/client.ts # Backend API wrapper
│ │ └── types/index.ts # TypeScript interfaces
│ ├── package.json
│ └── vite.config.ts
├── docs/
│ ├── architecture/ # Architecture diagram
│ └── figures/ # Evaluation chart images
├── paper/ # Research paper (PDF)
├── .env.example # Environment variable template
├── .gitignore
├── LICENSE # MIT License
└── README.md # This file
The evaluation tab generates publication-quality charts (300 DPI), including:
| Chart | Description |
|---|---|
| Training curves | Loss + accuracy over epochs (train vs. validation) |
| CNN confusion matrix | Heatmap with counts and percentages |
| GPT confusion matrix | Same format, GPT-5.4 predictions |
| Per-class metrics | Grouped bars: precision, recall, F1 per class |
| CNN vs GPT comparison | Side-by-side accuracy and F1 |
| Agreement pie chart | CNN–GPT agreement/disagreement rate |
| Confidence distribution | CNN softmax vs GPT confidence histogram |
| ROC curves | One-vs-rest per class with AUC |
| Grad-CAM grid | Sample images with heatmap overlays per class |
All charts can be exported as a ZIP for inclusion in academic papers.
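The numbers behind the per-class metrics and agreement charts follow standard definitions, which the sketch below spells out in plain Python. The helper names are hypothetical, and the project may compute these values with a library instead.

```python
def per_class_metrics(confusion, labels):
    """Precision, recall, and F1 per class.

    confusion[i][j] = number of samples with true class i predicted as class j.
    """
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        predicted = sum(row[k] for row in confusion)  # column sum
        actual = sum(confusion[k])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def agreement_rate(cnn_labels, gpt_labels):
    """Fraction of images on which the CNN and GPT predict the same class."""
    pairs = list(zip(cnn_labels, gpt_labels))
    return sum(a == b for a, b in pairs) / len(pairs) if pairs else 0.0
```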
The accompanying ACM-style research paper is available in the paper/ directory. It covers the methodology, experimental results, and comparison with prior work on the same dataset.
This project is licensed under the MIT License.
- University of Texas at Austin — MSAI program, AI for Healthcare course
- COVID-19 Radiography Database — Kaggle dataset by Tawsifur Rahman et al.
- EfficientNet — Tan & Le (2019), pre-trained weights via TensorFlow/Keras
- Azure OpenAI — GPT-5.4 vision capabilities for radiology analysis



