ESI analysis

Analysis of racial disparities in Emergency Severity Index (ESI) triage decisions using propensity score matching and high-risk symptom detection.

Overview

This repository analyzes emergency department triage data and computationally implements the ESI algorithm to identify potential racial disparities in ESI triage assignments. The analysis combines:

  • High-risk symptom detection from patient complaints
  • Danger zone vital signs identification
  • Propensity score matching to control for confounding variables
  • Statistical analysis of odds ratios across racial groups
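To make the matching step concrete, here is a minimal sketch of greedy 1:1 nearest-neighbor matching on propensity scores, without replacement. It is an illustration under stated assumptions, not the routine in src/propensity_score_matching.py: the function name `greedy_match`, the caliper value, and the example scores are all hypothetical, and the propensity scores are assumed to have been estimated already (for example, by a logistic regression on the covariates).

```python
def greedy_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated, control: dicts mapping a record id to its propensity score.
    Matching is without replacement; a treated record is left unmatched
    when no remaining control lies within `caliper` of its score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # copy so the caller's dict is not mutated
    pairs = []
    # Process treated records in ascending score order for reproducibility.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda k: abs(available[k] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores, for illustration only.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control = {"c1": 0.28, "c2": 0.60, "c3": 0.65, "c4": 0.10}
pairs = greedy_match(treated, control, caliper=0.05)
print(pairs)
```

Here `t3` stays unmatched because no control falls within the caliper. Greedy matching is order-dependent; optimal matching is a common alternative when that matters.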

Repository Structure

ESI/
├── binarization_code/          # Data binarization scripts per center
├── src/                        # Core analysis modules
│   ├── high_risk_dictionary.py # High-risk symptom detection functions
│   ├── vital_signs.py          # Danger zone vital signs analysis
│   └── propensity_score_matching.py # PSM analysis, odds ratio calculations, and plotting
├── center_configs.json         # Hospital-specific configuration variables
├── notebooks/                  # Jupyter notebooks for analysis and visualization
├── main.py                     # Main analysis pipeline
├── plot.py                     # Forest plot generation
├── requirements.txt            # Python dependencies
└── README.md                   

Quick Start

1. Installation

# Clone the repository
git clone https://github.com/cavalab/ESI.git
cd ESI

# Option A: create a conda/mamba environment with Python 3.11
mamba env create

# Option B: create a virtual environment with Python 3.11
python3.11 -m venv venv

# Activate the virtual environment (Option B)
source venv/bin/activate

# Install dependencies 
pip install -r requirements.txt

2. Data Preparation

In short, once the preprocessed data are in place (see 2.1):

bash binarize.sh              # binarize covariates from preprocessed data
bash run_analysis.sh results/ # run analysis, saving to results/

2.1 Data Preprocessing Pipeline

If you need to generate the binarized covariate files from raw data, follow this two-step process:

Step 1: Raw Data → Preprocessed Data

Use the scripts in ed-preprocessing/ for center-specific preprocessing.

These scripts produce preprocessed datasets, e.g.:

  • preprocessed_BIDMC.csv : Derived from the publicly available MIMIC-IV-ED dataset from Beth Israel Deaconess (Adult East)
  • preprocessed_Stanford.csv : Derived from the publicly available MC-MED dataset from Stanford Hospital (Adult West)

Step 2: Preprocessed Data → Binarized Covariates

Run the appropriate binarization script for each center:

python binarization-CHLA.py      
python binarization-BIDMC.py     
python binarization-Stanford.py  
python binarization-BCH.py       

This creates the data/preprocessed_{center}.csv files used in the main analysis.

3. Run Complete Analysis

See run_analysis.sh, which computes the OR comparisons as follows:

python main.py \
     --path_base ${data_base_directory} \
     --mode flagged_vs_unflagged \
     --center ${center} \
     --save_dir ${results_directory}

notebooks/forest_plot.ipynb visualizes these results.
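The same invocation can be scripted from Python to loop over centers. The sketch below only assembles the command shown above. The helper names `build_command` and `run_all` are hypothetical, the example paths are placeholders, and it is an assumption that `--center` accepts the center names listed under Hospital Centers. With `dry_run=True` nothing is executed; the commands are just printed.

```python
import subprocess

# Center names as listed in this README; assumed to be valid --center values.
CENTERS = ["CHLA", "BIDMC", "Stanford", "BCH"]


def build_command(center, path_base, save_dir, mode="flagged_vs_unflagged"):
    """Assemble the main.py command line as a list of arguments."""
    return [
        "python", "main.py",
        "--path_base", path_base,
        "--mode", mode,
        "--center", center,
        "--save_dir", save_dir,
    ]


def run_all(path_base, save_dir, dry_run=True):
    """Build one command per center; run them unless dry_run is set."""
    commands = [build_command(c, path_base, save_dir) for c in CENTERS]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if main.py exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


# Placeholder paths, for illustration only.
commands = run_all("data/", "results/", dry_run=True)
```

Passing the arguments as a list avoids shell quoting issues if a path contains spaces.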

Configuration

Hospital Centers

The analysis supports four hospital centers. Two of them (BIDMC and Stanford) are derived from publicly available datasets:

  • CHLA: Children's Hospital Los Angeles
  • BIDMC: Beth Israel Deaconess Medical Center
  • Stanford: Stanford Hospital
  • BCH: Boston Children's Hospital

Analysis Modes

  • flagged_vs_unflagged: Compares HB level 2, HB level 3, and HB level 2+3
  • all_combinations: Compares HB level 2, HB2: danger zone vitals, HB2: high risk symptoms, HB level 3

Center Configuration

Hospital-specific variables are defined in center_configs.json:

{
  "CHLA": {
    "triage_col": "esi_acuity",
    "complaint_col": "chief_complaint", 
    "race_predictor": "race_",
    "race_names": ["White", "Black", "Hispanic", "Asian"],
    "race_order": ["White", "Black", "Hispanic", "Asian"],
    "covariate_prefixes": ["age", "gender", "insurance"]
  }
}
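As an illustration of how such a configuration could be consumed, the sketch below parses the example entry and picks covariate columns by prefix. How main.py actually uses `covariate_prefixes` is not documented here, so the prefix matching is an assumption, and the helper names `load_center_config` and `select_covariates` as well as the column names are hypothetical.

```python
import json

# The example entry from this README, inlined so the sketch is self-contained.
EXAMPLE_CONFIG = """
{
  "CHLA": {
    "triage_col": "esi_acuity",
    "complaint_col": "chief_complaint",
    "race_predictor": "race_",
    "race_names": ["White", "Black", "Hispanic", "Asian"],
    "race_order": ["White", "Black", "Hispanic", "Asian"],
    "covariate_prefixes": ["age", "gender", "insurance"]
  }
}
"""


def load_center_config(text, center):
    """Parse the JSON text and return the entry for one center."""
    configs = json.loads(text)
    if center not in configs:
        raise KeyError(f"No configuration for center {center!r}")
    return configs[center]


def select_covariates(columns, prefixes):
    """Keep columns whose name starts with any of the given prefixes (assumed rule)."""
    return [c for c in columns if c.startswith(tuple(prefixes))]


cfg = load_center_config(EXAMPLE_CONFIG, "CHLA")
# Hypothetical column names, for illustration only.
columns = ["age_0_5", "gender_F", "insurance_public", "race_Black", "esi_acuity"]
covariates = select_covariates(columns, cfg["covariate_prefixes"])
print(covariates)
```

In practice the text would be read from center_configs.json rather than an inline string.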

Output Files

Output is written to {save_dir}/{center}/{mode}/:

  • complaint_with_mask_and_vitals_{center}.csv: Acuity data with high-risk flags
  • odds_ratios.csv: Odds ratios and confidence intervals
  • significance.csv: Statistical significance results
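For readers unfamiliar with the quantities in odds_ratios.csv, the sketch below computes an odds ratio with a Wald 95% confidence interval from a 2x2 table. This is the textbook unconditional estimator, shown for orientation only; the repository's own calculation (for example, how matched pairs are handled) may differ. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: group 1 with the outcome      b: group 1 without the outcome
    c: group 2 with the outcome      d: group 2 without the outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("All four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

An interval that contains 1 indicates that the difference is not statistically significant at the corresponding level.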

Acknowledgments

This code is used to produce the analysis in the following preprint:

Romero Mila, B., Coggan, H., Fine, A.M., Barak-Corren, Y., Reis, B.Y., Aysola, J., Chaudhari, P., La Cava, W.G., 2026. The Benefit of the Doubt Phenomenon in Emergency Triage Assignment Disparities. https://doi.org/10.64898/2026.02.12.26346184

This work was partially supported by NLM R01LM014300.

Contact: @lacava

https://cavalab.org
