(State: 11.05.2026)
This repository has been made publicly available at an early stage to facilitate review and collaboration while cleanup and stabilization efforts are still underway.
The work implemented thus far focuses on the OCR pipeline and the Vision-SSL prototype. Other areas are clearly marked as planned roadmap or design work.
WARNING: A major overhaul is currently in progress; unpredictable side effects may occur. Completeness is not guaranteed.
A work-in-progress ML portfolio focused on Python, machine learning, evaluation discipline, reproducible project structure, and clean engineering habits.
This portfolio is about learning by building: making steady progress through small, runnable experiments and tight feedback loops.
Principles I try to follow:
- Start small, then scale: get an end-to-end baseline working before adding complexity
- Understand before hacking: prefer reading docs, inspecting failures, and writing minimal repros over reverse-engineering libraries in the dark
- Make progress legible: scripts/configs/tests over notebooks, with decisions and results recorded
The goal isn't to look productive. It's to understand when a model works, why it fails, and how to fix it, without getting stuck in debugging hell.
Status key: 🚧 Active = notebooks/code/prototypes in progress | Planned phase = roadmap/design work, no completed implementation implied
For a quick review, start with OCR Pipeline and then Vision SSL Transfer. Synthetic Data and Time Series are priority tracks, but both are still in the planned phase.
| Project | Status | Key Technologies | Planned Interconnections |
|---|---|---|---|
| OCR Pipeline | 🚧 Active | Tesseract, TrOCR, SVM routing | Future: LLM post-processing |
| Tabular Boosting Suite | Planned phase | LightGBM, XGBoost, CatBoost, SHAP, TabPFN | Future AutoML input |
| Time Series Forecasting | Planned phase | Darts, NeuralProphet, conformal | Standalone priority track |
| Project | Status | Key Technologies | Planned Interconnections |
|---|---|---|---|
| Vision SSL Transfer | 🚧 Active | SSL (SimCLR, MAE), SHAP, timm | Shares encoder patterns with OCR |
| Graph Neural Networks | Planned phase | PyG, DGL, node/graph classification | Future Materials work |
| LLM Evaluation Harness | Planned phase | lm-eval-harness, custom metrics | Future OCR post-processing evaluation |
| Quantum Machine Learning | Planned phase | Qiskit, PennyLane, TFQ, VQC, quantum kernels | Future optimization/materials track |
| Project | Status | Key Technologies | Interconnections |
|---|---|---|---|
| Bayesian Optimization | Planned phase | Optuna, BoTorch, Ax | Future AutoML/RL support |
| AutoML Comparison | Planned phase | Auto-sklearn, FLAML, H2O | Roadmap item; no project directory yet |
| Project | Status | Key Technologies | Interconnections |
|---|---|---|---|
| Scientific ML - Materials | Planned phase | JAX, equinox, crystal graphs | Future GNN application |
| RL Operations Simulator | Planned phase | Gymnasium, Stable-Baselines3 | Future Bayesian tuning use case |
| Synthetic Data Generation | Planned phase | CTGAN, SDV, privacy metrics | Priority planned track |
| Project | Status | Key Technologies | Notes |
|---|---|---|---|
| Framework Comparison | Planned phase | PyTorch, TensorFlow, JAX | Roadmap item; no project directory yet |
| ONNX Export Hub | Planned phase | ONNX, ONNX Runtime, TensorRT | Future deployment optimization |
- Python 3.13+ (deliberate choice for latest typing features; some ML libraries may lag, so tested combinations are documented per project)
- uv (recommended) or pip
# Clone the repository
git clone https://github.com/sm4rtm4art/machine_learning.git
cd machine_learning
# Install with uv (recommended)
uv sync --all-extras
# Or with pip
pip install -e ".[all]"
# Set up pre-commit hooks
make pre-commit-install
Active projects follow a consistent CLI interface (planned projects have READMEs only):
# Example: OCR Pipeline (🚧 Active)
uv run python projects/ocr_pipeline/scripts/download_data.py
uv run python projects/ocr_pipeline/scripts/train.py
uv run python projects/ocr_pipeline/scripts/evaluate.py
uv run python projects/ocr_pipeline/scripts/export.py
Note: Only projects marked "🚧 Active" have implemented scripts or prototypes. Projects marked "Planned phase" contain design documentation or roadmap notes only.
make mlflow
# Open http://localhost:5000
make evidently
# Open http://localhost:8000
# MLflow example (linear+FFT baseline vs Conv2D)
uv run python projects/vision_ssl_transfer/prototypes/mlflow_quickstart_example.py
# Same MLflow example but using ssl_2d_minimal generated samples
USE_SSL2D_SAMPLES=1 uv run python projects/vision_ssl_transfer/prototypes/mlflow_quickstart_example.py
# Optuna + MLflow nested trial runs (requires optuna from tabular extra)
USE_OPTUNA=1 OPTUNA_TRIALS=20 uv run --extra tabular python projects/vision_ssl_transfer/prototypes/mlflow_quickstart_example.py
# Evidently drift + quality report from ssl_2d_minimal generated samples
uv run --extra monitoring python projects/vision_ssl_transfer/prototypes/evidently_quickstart_example.py
machine_learning/
├── src/ml_portfolio/  # Shared library code
│   ├── common/  # Config, logging, paths
│   ├── metrics/  # Evaluation metrics by domain
│   ├── eval/  # Slicing, robustness, drift
│   └── tracking/  # MLflow utilities
│
├── projects/  # Individual ML projects
│   ├── _template/  # Copyable project template
│   └── <project>/
│       ├── configs/  # Hydra/YAML configs
│       ├── notebooks/  # Report notebooks only
│       ├── scripts/  # CLI entry points
│       ├── project/  # Project-specific code
│       └── tests/  # Project tests
│
├── infra/  # Infrastructure (Docker)
│   ├── mlflow/  # MLflow server
│   └── monitoring/  # Evidently for drift
│
├── docs/  # Documentation
├── data/  # Data directory (gitignored)
├── artifacts/  # Model artifacts (gitignored)
└── reports/  # Generated evaluation reports
Why this structure?
Shared library in src/: Common utilities (metrics, tracking, data loading) are reusable across projects. This avoids copy-paste and ensures consistency.
Projects as self-contained units: Each project has its own configs, scripts, and tests. You can understand a project without reading the entire repo.
Notebooks as reports only: Notebooks are for visualization and communication, not for logic. All code lives in importable modules. This makes testing possible and diffs readable.
Consistent CLI per project: Active projects follow a standard interface (download_data.py, train.py, evaluate.py, export.py, serve.py). This reduces cognitive load and enables automation. Planned projects will adopt this structure as they're implemented.
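To illustrate how a standard script interface enables automation, here is a minimal sketch that builds the `uv run` commands for one project. The helper names `pipeline_commands` and `run_pipeline` are hypothetical and not part of this repository; the script names and directory layout are taken from the text above. `serve.py` is left out on the assumption that it is a long-running process.

```python
import subprocess
from pathlib import Path

# Script names from the standard interface described above.
# serve.py is omitted here (assumed to be long-running).
STANDARD_STEPS = ("download_data.py", "train.py", "evaluate.py", "export.py")


def pipeline_commands(
    project: str, steps: tuple[str, ...] = STANDARD_STEPS
) -> list[list[str]]:
    """Build one `uv run python <script>` command per step (hypothetical helper)."""
    scripts_dir = Path("projects") / project / "scripts"
    return [["uv", "run", "python", (scripts_dir / step).as_posix()] for step in steps]


def run_pipeline(project: str) -> None:
    """Run the steps in order and stop at the first failing one."""
    for command in pipeline_commands(project):
        subprocess.run(command, check=True)


# Printing the commands is side-effect free; run_pipeline() would execute them.
for command in pipeline_commands("ocr_pipeline"):
    print(" ".join(command))
```

Because every active project is expected to expose the same script names, the same loop should work for any of them by changing only the project directory name.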
The target evaluation standard for this portfolio is to go beyond aggregate metrics:
❌ "Model achieves 95% accuracy"
✅ "Model achieves 95% accuracy overall, but 72% on edge cases involving X"
What this means in practice
Slice-based evaluation: Break down performance by meaningful subgroups (data quality, category frequency, edge cases).
Calibration: A model that says "90% confident" should be right about 90% of the time. We measure this with expected calibration error (ECE) and reliability diagrams.
Robustness: How does performance degrade with noise, missing data, or distribution shift?
Decision curves: For classification, accuracy isn't enough. We analyze the tradeoff between false positives and false negatives at different thresholds.
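As a concrete illustration of the calibration item above, here is a minimal sketch of ECE with equal-width confidence bins, in pure Python. The function name, the bin count of 10, and the input format (one confidence and one correct/incorrect flag per prediction) are assumptions made for this example, not the repository's implementation.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width binned ECE: sum over bins of weight * |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")
    if any(not 0.0 <= c <= 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    # Group (confidence, hit) pairs by bin; a confidence of 1.0 goes in the last bin.
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece
```

For example, ten predictions made at 0.9 confidence of which only five are correct give an ECE of about 0.4, while nine correct out of ten give an ECE of about 0. A reliability diagram plots the same per-bin accuracy against mean confidence.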
Implemented evaluate.py scripts should move toward this artifact convention:
| Artifact | Purpose |
|---|---|
| metrics.json | Primary metrics for CI gates |
| slices.csv | Performance by subgroup |
| robustness.csv | Degradation under perturbations |
| plots/ | Visualizations (calibration, confusion, etc.) |
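The artifact convention above could be produced along these lines. This is a sketch under stated assumptions: the helper names, the `accuracy` key in `metrics.json`, and the `slice,n,accuracy` columns in `slices.csv` are illustrative choices, not a schema defined by this repository. `robustness.csv` is omitted because it would need a perturbation routine.

```python
import csv
import json
from collections import defaultdict
from pathlib import Path
from typing import Iterable


def slice_accuracy(rows: Iterable[tuple[str, bool]]) -> dict[str, tuple[int, float]]:
    """Map each slice name to (count, accuracy) from (slice_name, is_correct) pairs."""
    hits: dict[str, int] = defaultdict(int)
    counts: dict[str, int] = defaultdict(int)
    for name, is_correct in rows:
        counts[name] += 1
        hits[name] += int(is_correct)
    return {name: (counts[name], hits[name] / counts[name]) for name in counts}


def write_artifacts(out_dir: Path, rows: list[tuple[str, bool]]) -> None:
    """Write metrics.json, slices.csv and an empty plots/ directory into out_dir."""
    if not rows:
        raise ValueError("need at least one evaluated example")
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "plots").mkdir(exist_ok=True)

    # Aggregate metric, intended for CI gates (key name is an assumption).
    overall = sum(int(ok) for _, ok in rows) / len(rows)
    (out_dir / "metrics.json").write_text(json.dumps({"accuracy": overall}, indent=2))

    # Per-slice breakdown (column names are an assumption).
    with (out_dir / "slices.csv").open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["slice", "n", "accuracy"])
        for name, (n, acc) in sorted(slice_accuracy(rows).items()):
            writer.writerow([name, n, f"{acc:.4f}"])
```

Keeping the aggregate number and the per-slice table in separate files lets a CI gate read one small JSON file while reviewers inspect the subgroup breakdown.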
make lint # Run ruff linter
make format # Auto-format code
make typecheck # Run mypy
make check # All of the above
make test # Run all tests
make test-cov # With coverage report
Pre-commit hooks enforce:
- Code formatting (ruff)
- Linting (ruff)
- Type checking (mypy) on src/ and project/ modules
- Notebook output stripping
- Large file prevention
See docs/repo_conventions.md for detailed guidelines on:
- Where code should live
- Notebook policy
- CLI contracts
- Naming conventions
See docs/evaluation_standards.md for:
- Required metrics by problem type
- Artifact specifications
- Slice definitions
See docs/mlflow_conventions.md for:
- Experiment naming
- Tag schema
- Artifact organization
See infra/monitoring/evidently/README.md for:
- Evidently service usage
- Monitoring use cases
This project is licensed under the Apache License 2.0; see the LICENSE file for details.