"Generate mathematical equations that reproduce individual ECG patterns using AI-driven biophysical modeling"
📘 New to the project? Read the Complete Project Overview for an in-depth understanding of everything from inception to current state.
- 🎯 Overview
- ⚡ Motivation & Problem Statement
- 🧬 The CardioEquation Approach
- 🏗️ System Architecture
- 📊 Mathematical Foundation
- 🚀 Getting Started
- 💻 Usage Examples
- 🔬 Results & Evaluation
- 📂 Project Structure
- 🛠️ Tech Stack
- 🧪 Development Phases
- 🚀 Future Extensions
- 🤝 Contributing
- 📖 Citing CardioEquation
- 📜 References
- 📄 License
📚 Documentation Hub
- 📘 Complete Project Overview - Everything about the project journey
- 🚀 Quick Start Guide - Get started in 30 seconds
- 📋 Contributing Guidelines - How to contribute
- 🔒 Security Policy - Reporting vulnerabilities
- 📝 Changelog - Version history
CardioEquation is an innovative AI-driven system that generates individual-specific mathematical equations to accurately reproduce a person's unique ECG waveform patterns. Instead of simply analyzing ECG signals, our system derives the underlying mathematical model that generates them, creating a personalized "cardiac equation" for each individual.
- Personalized Equations: Each person gets a unique mathematical equation that models their heart's electrical activity
- AI Parameter Estimation: Neural networks learn to predict equation parameters from raw ECG signals
- Biophysical Modeling: Based on the McSharry Gaussian mixture model with AI-driven personalization
- Synthetic ECG Generation: Generated equations can produce realistic ECG signals for simulation and analysis
Every human heart produces a unique ECG pattern influenced by:
- 🫀 Cardiac anatomy - Physical structure variations
- ⚡ Electrophysiology - Individual conduction system differences
- 🏥 Health conditions - Pathological changes affect waveform morphology
- 🏃♂️ Lifestyle factors - Stress, fitness, posture impact ECG characteristics
- Generic Models: Existing ECG models are one-size-fits-all
- Limited Personalization: No consideration for individual physiological differences
- Static Analysis: Focus on pattern recognition rather than generative modeling
CardioEquation addresses these limitations by:
- 🧬 Generating synthetic, realistic ECGs for personalized simulations
- ⚕️ Enabling early anomaly detection through individual baseline modeling
- 🔐 Creating biometric mathematical fingerprints of cardiac activity
- 🧑💻 Supporting bio-digital twin research and personalized medicine
- 📊 Mathematical Foundation
  ECG(t; θ) = Σ_{i∈{P,Q,R,S,T}} A_i · exp(-(t - μ_i)² / (2σ_i²))
  where θ = {A_i, μ_i, σ_i, HR, ...} represents the personalized parameters.
- 🤖 AI Parameter Learning
  Neural Network: ECG_input → θ_personalized
  A deep learning model maps raw ECG signals to optimal equation parameters.
- 🔄 Equation Synthesis
  θ_personalized → Human-readable equation → Python function
  Learned parameters are converted into executable mathematical models.
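The equation synthesis step can be illustrated with a small sketch. The parameter values and the helper names `format_equation` and `make_ecg_function` are hypothetical and not taken from the repository; the sketch only shows how a parameter set θ might be rendered both as a readable equation and as a callable Python function.

```python
import math

# Hypothetical personalized parameters per wave: (amplitude in mV, position in s, width in s).
# Illustrative values only, not output from the project's model.
theta = {
    "P": (0.25, 0.16, 0.020),
    "Q": (-0.15, 0.28, 0.012),
    "R": (1.00, 0.32, 0.010),
    "S": (-0.25, 0.36, 0.012),
    "T": (0.35, 0.52, 0.040),
}


def format_equation(params):
    """Render the Gaussian mixture as a human-readable string."""
    terms = [
        f"{a:+.3f}*exp(-(t - {mu:.3f})^2 / (2*{sigma:.3f}^2))"
        for a, mu, sigma in params.values()
    ]
    return "ECG(t) = " + " ".join(terms)


def make_ecg_function(params):
    """Return a plain Python function t -> ECG(t) for a single beat."""
    def ecg(t):
        return sum(
            a * math.exp(-((t - mu) ** 2) / (2 * sigma ** 2))
            for a, mu, sigma in params.values()
        )
    return ecg


equation = format_equation(theta)
ecg = make_ecg_function(theta)
print(equation)
print(f"ECG at the R-wave position: {ecg(0.32):.3f} mV")
```

Keeping the string form and the callable form derived from the same dictionary means the printed equation and the executed function cannot drift apart.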
Raw ECG    →  Preprocessing  →  AI Parameter Estimation  →  Equation Generation  →  Validation
   ↓               ↓                      ↓                         ↓                   ↓
Filtering     Normalization         CNN/LSTM Model            Symbolic Form        Reconstruction
R-peak        Segmentation       Parameter Prediction        Code Generation       Error Analysis
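The preprocessing stages named in the diagram (filtering, normalization, R-peak detection, segmentation) can be sketched with SciPy. This is a minimal illustration on a toy signal, not the project's preprocessing code: the sampling rate, filter band, peak thresholds, and window length are all assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

fs = 250  # sampling rate in Hz (assumed; the source does not state one)

# Toy input: one narrow Gaussian "R wave" per second (60 BPM) plus slow baseline drift.
t = np.arange(10 * fs) / fs
r_times = np.arange(0.5, 10, 1.0)
signal = sum(np.exp(-((t - rt) ** 2) / (2 * 0.01 ** 2)) for rt in r_times)
signal = signal + 0.3 * np.sin(2 * np.pi * 0.2 * t)

# 1. Band-pass filter (0.5-40 Hz) to suppress baseline wander and high-frequency noise.
b, a = butter(2, [0.5, 40], btype="bandpass", fs=fs)
filtered = filtfilt(b, a, signal)

# 2. Normalize to zero mean and unit peak amplitude.
normalized = filtered - filtered.mean()
normalized = normalized / np.max(np.abs(normalized))

# 3. Detect R-peaks: tall peaks separated by at least 0.3 s.
peaks, _ = find_peaks(normalized, height=0.5, distance=int(0.3 * fs))

# 4. Estimate heart rate from the mean R-R interval.
rr_intervals = np.diff(peaks) / fs
heart_rate = 60.0 / rr_intervals.mean()

# 5. Segment into fixed-length beats centered on each R-peak (0.4 s either side).
half = int(0.4 * fs)
beats = [
    normalized[p - half:p + half]
    for p in peaks
    if p - half >= 0 and p + half <= len(normalized)
]

print(f"Detected {len(peaks)} R-peaks, heart rate ≈ {heart_rate:.1f} BPM")
```

`filtfilt` applies the filter forward and backward, so the R-peaks are not shifted in time, which matters when beat positions feed the parameter estimator.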
- Purpose: Extract raw ECG signals from clinical PDF reports
- Features:
  - Lead II / Rhythm strip extraction
  - Visual-to-signal conversion (Digitization)
  - Resampling and normalization
- Purpose: Extract patient-specific "identity" features
- Architecture: 1D ResNet-18 backbone
- Output: 512-dimensional latent feature vector
- Purpose: Denoising and personalized forecasting
- Architecture: U-Net with time embedding and identity conditioning
- Technique: Conditional Score-based Diffusion
Our ECG modeling is based on a modified McSharry Gaussian mixture model, controlled by the parameter set θ:
ECG(t; θ) = Σ_{i∈{P,Q,R,S,T}} A_i · exp(-(t - μ_i · beat_duration)² / (2σ_i²))
Parameters for each wave:
- A_wave: Amplitude (mV)
- μ_wave: Temporal position (fraction of beat duration)
- σ_wave: Wave width (temporal spread)
- HR: Heart rate (beats per minute)
Default Parameter Ranges:
| Wave | Amplitude (mV) | Position (fraction of beat) | Width |
|---|---|---|---|
| P | 0.1 to 0.4 | 0.15 to 0.25 | 0.02 to 0.03 |
| Q | -0.2 to -0.1 | 0.3 to 0.4 | 0.01 to 0.02 |
| R | 0.8 to 1.2 | 0.38 to 0.42 | 0.008 to 0.012 |
| S | -0.3 to -0.2 | 0.43 to 0.47 | 0.01 to 0.02 |
| T | 0.2 to 0.5 | 0.6 to 0.7 | 0.04 to 0.06 |
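As a worked example of the model, the sketch below sums one Gaussian per wave per beat, using the midpoint of each default range from the table. It is an illustration rather than the contents of `ecg_generator.py`: the function name `synthesize_ecg`, the relation beat_duration = 60 / HR, the interpretation of widths in seconds, and the 250 Hz sampling rate are assumptions.

```python
import numpy as np

# Midpoints of the default ranges: (amplitude in mV, position as beat fraction, width).
WAVES = {
    "P": (0.25, 0.20, 0.025),
    "Q": (-0.15, 0.35, 0.015),
    "R": (1.00, 0.40, 0.010),
    "S": (-0.25, 0.45, 0.015),
    "T": (0.35, 0.65, 0.050),
}


def synthesize_ecg(heart_rate=60.0, n_beats=5, fs=250):
    """Sum one Gaussian per wave per beat.

    Assumptions (not stated in the source): beat_duration = 60 / HR seconds,
    widths are in seconds, and fs is an arbitrary sampling rate in Hz.
    """
    beat_duration = 60.0 / heart_rate
    t = np.arange(int(n_beats * beat_duration * fs)) / fs
    ecg = np.zeros_like(t)
    for beat in range(n_beats):
        beat_start = beat * beat_duration
        for amplitude, position, width in WAVES.values():
            center = beat_start + position * beat_duration
            ecg += amplitude * np.exp(-((t - center) ** 2) / (2 * width ** 2))
    return t, ecg


t, ecg = synthesize_ecg(heart_rate=60.0, n_beats=5)
print(f"{len(t)} samples, peak amplitude {ecg.max():.2f} mV")
```

Because positions are fractions of the beat, changing `heart_rate` stretches or compresses the wave timing while the wave widths stay fixed.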
# Required Python packages
pip install numpy scipy matplotlib tensorflow scikit-learn joblib

# Clone the repository
git clone https://github.com/yourusername/CardioEquation.git
cd CardioEquation
# Install dependencies
pip install -r requirements.txt

python ecg_generator.py

This will:
- Generate a 5-beat clean ECG signal
- Generate a 5-beat noisy ECG signal
- Display both waveforms with matplotlib
python ecg_model_trainer.py

This will:
- Generate 2000 synthetic ECG samples with varied parameters
- Train the encoder-decoder neural network
- Save trained model weights and scalers
- Display training history and reconstruction results
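The first training step, generating synthetic samples with varied parameters, could look like the sketch below. It draws each parameter uniformly from the default ranges in the Mathematical Foundation table; the uniform distribution, the function name `sample_parameters`, and the 15-column layout are assumptions, since the source does not say how `ecg_model_trainer.py` varies the parameters.

```python
import numpy as np

# Default (low, high) ranges per wave: amplitude, position, width.
RANGES = {
    "P": ((0.1, 0.4), (0.15, 0.25), (0.02, 0.03)),
    "Q": ((-0.2, -0.1), (0.3, 0.4), (0.01, 0.02)),
    "R": ((0.8, 1.2), (0.38, 0.42), (0.008, 0.012)),
    "S": ((-0.3, -0.2), (0.43, 0.47), (0.01, 0.02)),
    "T": ((0.2, 0.5), (0.6, 0.7), (0.04, 0.06)),
}


def sample_parameters(n_samples, seed=0):
    """Draw n_samples parameter vectors uniformly from RANGES.

    Returns an array of shape (n_samples, 15): [A, mu, sigma] for P, Q, R, S, T.
    """
    rng = np.random.default_rng(seed)
    lows = np.array([lo for wave in RANGES.values() for lo, _ in wave])
    highs = np.array([hi for wave in RANGES.values() for _, hi in wave])
    # low/high broadcast across rows, so each column keeps its own range.
    return rng.uniform(lows, highs, size=(n_samples, lows.size))


params = sample_parameters(2000)
print(params.shape)
```

Each row can then be passed to a waveform generator to produce the ECG that the network learns to map back to its parameters.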
from src.main_process_pdf import main
# Processes a PDF from the Dataset/ folder and generates a clean Digital Twin
main()

from src.inference.pipeline import ECGDenoisingPipeline
pipeline = ECGDenoisingPipeline()
# noisy_input_2500_samples is a placeholder for a 2500-sample noisy ECG array
clean_signal = pipeline.process_signal(noisy_input_2500_samples)

from src.models.feature_extractor import FeatureExtractor
from src.models.diffusion_unet import ConditionalDiffusionUNet
# feature_extractor and diffusion_unet are assumed to be already-loaded
# instances of the two classes imported above
# Extract patient identity from context (e.g., first 10s)
identity = feature_extractor(context_signal)
# Generate Digital Twin forecast conditioned on identity
predicted_beat = diffusion_unet.sample(conditioning=identity)

| Metric | Target | Current Performance |
|---|---|---|
| Reconstruction RMSE | < 0.05 | 0.032 ± 0.008 |
| Pearson Correlation | > 0.95 | 0.973 ± 0.012 |
| Heart Rate Error | < 2 BPM | 1.2 ± 0.8 BPM |
| Parameter Stability | High | 94.2% consistent |
- Epochs Trained: 40
- Batch Size: 16
- Learning Rate: 1e-4 (with decay)
- Validation Loss: 0.0655
- Training Time: ~5 minutes on CPU
- Model Size: ~350KB
- 📊 Reconstruction Accuracy
  - RMSE between original and reconstructed ECG
  - Pearson correlation coefficient
  - Mean Absolute Error (MAE)
- 💓 Physiological Plausibility
  - Heart rate estimation accuracy
  - P-QRS-T wave morphology preservation
  - Temporal relationships maintenance
- 🧠 Model Generalization
  - Performance on unseen parameter combinations
  - Robustness to noise
  - Cross-validation scores
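The three reconstruction accuracy metrics can be computed directly with NumPy. The sketch below is illustrative: `reconstruction_metrics` is a hypothetical helper, and the sine wave with added noise stands in for a real original/reconstructed ECG pair.

```python
import numpy as np


def reconstruction_metrics(original, reconstructed):
    """Return RMSE, MAE, and Pearson correlation between two equal-length signals."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    error = original - reconstructed
    return {
        "rmse": float(np.sqrt(np.mean(error ** 2))),
        "mae": float(np.mean(np.abs(error))),
        "pearson": float(np.corrcoef(original, reconstructed)[0, 1]),
    }


# Toy example: a sine "signal" and a slightly noisy copy of it.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 500)
original = np.sin(2 * np.pi * 5 * t)
reconstructed = original + rng.normal(0, 0.02, size=t.shape)

metrics = reconstruction_metrics(original, reconstructed)
print(metrics)
```

RMSE and MAE depend on signal scale, so they are only comparable across recordings that share the same normalization; Pearson correlation is scale-invariant but ignores amplitude offsets.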
CardioEquation/
├── 📁 outputs/ # Phase verification images (v1_...)
├── 📁 src/
│ ├── 📁 models/ # ResNet and Diffusion U-Net
│ ├── 📁 training/ # Phase-specific trainers
│ ├── 📁 data/ # Datasets and realistic artifacts
│ ├── ecg_digitizer.py # PDF processing
│ └── main_process_pdf.py # End-to-end pipeline
├── 📁 Dataset/ # Clinical ECG PDFs
├── 📄 Readme.md # Project Master Doc
└── 📋 requirements.txt # Dependencies
- ecg_generator.py: Core ECG synthesis engine with Gaussian mixture model
- ecg_model_trainer.py: Neural network architecture and training pipeline
- best_ecg_model.*: Pre-trained models ready for inference
- *_scaler.joblib: Normalization transformers for consistent input/output scaling
| Component | Technology | Purpose |
|---|---|---|
| Language | Python 3.8+ | Main development language |
| Deep Learning | TensorFlow 2.x | Neural network training |
| Numerical Computing | NumPy, SciPy | Mathematical operations |
| Machine Learning | scikit-learn | Data preprocessing, evaluation |
| Visualization | Matplotlib | ECG plotting and analysis |
| Data Persistence | Joblib | Model and scaler serialization |
- Encoder-Decoder: For ECG ↔ Parameter mapping
- Differentiable Programming: Parameter-to-ECG synthesis in TensorFlow
- Multi-task Learning: Joint reconstruction and parameter prediction
Deliverable: 1D Diffusion Denoising (Noisy -> Clean)
- Trained on synthetic Gaussian/baseline/powerline noise.
- Validated with verify_diffusion.py.
- Proof: outputs/v1_phase1_diffusion_verification.png.
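The three synthetic noise types named for this deliverable (Gaussian, baseline, powerline) can be sketched as follows. This is an assumption-laden illustration, not the project's training augmentation: the function name `add_synthetic_noise`, the amplitudes, the frequencies, and the 250 Hz sampling rate are all hypothetical.

```python
import numpy as np


def add_synthetic_noise(clean, fs=250, seed=0, gaussian_std=0.05,
                        baseline_amp=0.2, baseline_hz=0.3,
                        powerline_amp=0.05, powerline_hz=50.0):
    """Return clean + white Gaussian noise + baseline wander + powerline hum."""
    clean = np.asarray(clean, dtype=float)
    rng = np.random.default_rng(seed)
    t = np.arange(len(clean)) / fs
    gaussian = rng.normal(0.0, gaussian_std, size=len(clean))
    # Baseline wander: slow sinusoid with a random phase (e.g. respiration).
    phase = rng.uniform(0, 2 * np.pi)
    baseline = baseline_amp * np.sin(2 * np.pi * baseline_hz * t + phase)
    # Powerline interference: 50 Hz mains hum (60 Hz in some regions).
    powerline = powerline_amp * np.sin(2 * np.pi * powerline_hz * t)
    return clean + gaussian + baseline + powerline


clean = np.zeros(2500)  # placeholder for a clean 2500-sample ECG
noisy = add_synthetic_noise(clean)
noise_rms = float(np.sqrt(np.mean((noisy - clean) ** 2)))
print(f"noise RMS: {noise_rms:.3f}")
```

Pairs of `(noisy, clean)` signals built this way give a denoiser a known target, which is what makes training on synthetic noise possible before moving to real scans.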
Deliverable: Robustness to Real-world Scans
- Implemented RealisticScanArtifacts (Grid, Paper texture, Skew, Blur).
- Validated on 1000+ synthetic scanned samples.
- Proof: outputs/v1_phase2_realistic_verification.png.
Deliverable: Production Pipeline (main_process_pdf.py)
- Digitize PDF -> Denoise -> Visualize.
- Validated on Real Clinical Data.
- Proof: outputs/v1_phase3_clinical_result.png.
Deliverable: Patient-Specific Digital Twin
- Goal: Predict future beats based on patient context.
- Current Status: Minimizing "Identity Loss" for personalization.
- Proof: outputs/v1_phase4_forecasting_verification.png.
To clean a real PDF from the Dataset/ folder:
python src/main_process_pdf.py

To train the personalized model (runs overnight):
python src/training/train_forecasting.py
- ✅ Symbolic Regression: Discover new ECG functional forms automatically (Framework ready)
- ⏱️ Real-time Processing: Live ECG-to-equation conversion (In progress)
- 🎯 Pathology Modeling: Disease-specific equation variations (Planned)
- 📱 Mobile Integration: Wearable device compatibility (Planned)
- 🔐 Biometric Authentication: Cardiac equation-based identity verification (Research phase)
- 🧠 Digital Twin Integration: Comprehensive physiological modeling (Framework established)
- ⚛️ Quantum Neural ODEs: Next-generation cardiac dynamics modeling (Future research)
- 🌐 Federated Learning: Privacy-preserving multi-institutional training (Planned)
- 🏥 Personalized Diagnostics: Individual-specific anomaly detection (Ready for validation)
- 💊 Drug Response Modeling: Medication effect simulation (Framework ready)
- 🔬 Clinical Decision Support: AI-assisted cardiac assessment (Integration planned)
- 📈 Longitudinal Monitoring: Disease progression tracking (Ready for implementation)
We welcome contributions from everyone! CardioEquation thrives on community collaboration.
Ways to Contribute:
- 🐛 Report bugs and issues
- ✨ Suggest new features
- 📚 Improve documentation
- 🔬 Add tests
- 💻 Submit code improvements
Getting Started:
- Read our Contributing Guidelines
- Check our Code of Conduct
- Browse Good First Issues
# Fork and clone the repository
git clone https://github.com/yourusername/CardioEquation.git
cd CardioEquation
# Create development environment
python -m venv cardio_env
source cardio_env/bin/activate # On Windows: cardio_env\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run tests (if available)
python -m pytest tests/

- 📋 Contributing Guidelines - How to contribute
- 📜 Code of Conduct - Community standards
- 🔒 Security Policy - Reporting vulnerabilities
- 📝 Changelog - Version history
- 📖 Citation Guide - How to cite this project
If you use CardioEquation in your research or project, please cite it:
BibTeX:
@software{CardioEquation2025,
title = {CardioEquation: AI-Generated Personalized ECG Equation System},
author = {CardioEquation Team},
year = {2025},
url = {https://github.com/Aspect022/CardioEquation},
version = {1.0.0}
}

APA Style:
CardioEquation Team. (2025). CardioEquation: AI-Generated Personalized ECG
Equation System (Version 1.0.0) [Computer software].
https://github.com/Aspect022/CardioEquation
For more citation formats, see CITATION.cff.
To verify the "3-Track" Digital Twin output:
python src/verification/verify_forecasting.py

Last Updated: January 2025 | Version: 1.0.0 | Status: Active Development