ML Portfolio

CI | Python 3.13+ | License: Apache 2.0 | Code style: ruff | MLflow convention

🚧 Early Public Preview / Work in Progress 🚧

(Status as of 11.05.2026)

This repository has been made publicly available at an early stage to facilitate review and collaboration while cleanup and stabilization efforts are still underway.

The work implemented thus far focuses on the OCR pipeline and the Vision-SSL prototype. Other areas are clearly marked as planned roadmap or design work.

WARNING: A major overhaul is currently in progress; unpredictable side effects may occur. Completeness is not guaranteed.


A work-in-progress ML portfolio focused on Python, machine learning, evaluation discipline, reproducible project structure, and clean engineering habits.

Philosophy

This portfolio is about learning by building: making steady progress through small, runnable experiments and tight feedback loops.

Principles I try to follow:

  • Start small, then scale: get an end-to-end baseline working before adding complexity
  • Understand before hacking: prefer reading docs, inspecting failures, and writing minimal repros over reverse‑engineering libraries in the dark
  • Make progress legible: scripts/configs/tests over notebooks, with decisions and results recorded

The goal isn't to look productive. It's to understand when a model works, why it fails, and how to fix it, without getting stuck in debugging hell.


Projects

Status key: 🚧 Active = notebooks/code/prototypes in progress | 📋 Planned phase = roadmap/design work, no completed implementation implied

For a quick review, start with OCR Pipeline and then Vision SSL Transfer. Synthetic Data and Time Series are priority tracks, but both are still in the planned phase.

Core ML

| Project | Status | Key Technologies | Planned Interconnections |
| --- | --- | --- | --- |
| OCR Pipeline | 🚧 Active | Tesseract, TrOCR, SVM routing | Future: LLM post-processing |
| Tabular Boosting Suite | 📋 Planned phase | LightGBM, XGBoost, CatBoost, SHAP, TabPFN | Future AutoML input |
| Time Series Forecasting | 📋 Planned phase | Darts, NeuralProphet, conformal | Standalone priority track |

Advanced Architectures

| Project | Status | Key Technologies | Planned Interconnections |
| --- | --- | --- | --- |
| Vision SSL Transfer | 🚧 Active | SSL (SimCLR, MAE), SHAP, timm | Shares encoder patterns with OCR |
| Graph Neural Networks | 📋 Planned phase | PyG, DGL, node/graph classification | Future Materials work |
| LLM Evaluation Harness | 📋 Planned phase | lm-eval-harness, custom metrics | Future OCR post-processing evaluation |
| Quantum Machine Learning | 📋 Planned phase | Qiskit, PennyLane, TFQ, VQC, quantum kernels | Future optimization/materials track |

Optimization & Meta-Learning

| Project | Status | Key Technologies | Interconnections |
| --- | --- | --- | --- |
| Bayesian Optimization | 📋 Planned phase | Optuna, BoTorch, Ax | Future AutoML/RL support |
| AutoML Comparison | 📋 Planned phase | Auto-sklearn, FLAML, H2O | Roadmap item; no project directory yet |

Scientific & Applied

| Project | Status | Key Technologies | Interconnections |
| --- | --- | --- | --- |
| Scientific ML - Materials | 📋 Planned phase | JAX, equinox, crystal graphs | Future GNN application |
| RL Operations Simulator | 📋 Planned phase | Gymnasium, Stable-Baselines3 | Future Bayesian tuning use case |
| Synthetic Data Generation | 📋 Planned phase | CTGAN, SDV, privacy metrics | Priority planned track |

Infrastructure

| Project | Status | Key Technologies | Notes |
| --- | --- | --- | --- |
| Framework Comparison | 📋 Planned phase | PyTorch, TensorFlow, JAX | Roadmap item; no project directory yet |
| ONNX Export Hub | 📋 Planned phase | ONNX, ONNX Runtime, TensorRT | Future deployment optimization |

Quick Start

Prerequisites

  • Python 3.13+ (a deliberate choice for the latest typing features; some ML libraries may lag behind, so tested combinations are documented per project)
  • uv (recommended) or pip

Installation

# Clone the repository
git clone https://github.com/sm4rtm4art/machine_learning.git
cd machine_learning

# Install with uv (recommended)
uv sync --all-extras

# Or with pip
pip install -e ".[all]"

# Set up pre-commit hooks
make pre-commit-install

Running a Project

Active projects share a consistent CLI (planned projects have READMEs only):

# Example: OCR Pipeline (🚧 Active)
uv run python projects/ocr_pipeline/scripts/download_data.py
uv run python projects/ocr_pipeline/scripts/train.py
uv run python projects/ocr_pipeline/scripts/evaluate.py
uv run python projects/ocr_pipeline/scripts/export.py

Note: Only projects marked "🚧 Active" have implemented scripts or prototypes. Projects marked "📋 Planned phase" contain design documentation or roadmap notes only.

Start MLflow

make mlflow
# Open http://localhost:5000

Start Evidently

make evidently
# Open http://localhost:8000

Prototype Tracking and Monitoring Examples

# MLflow example (linear+FFT baseline vs Conv2D)
uv run python projects/vision_ssl_transfer/prototypes/mlflow_quickstart_example.py

# Same MLflow example but using ssl_2d_minimal generated samples
USE_SSL2D_SAMPLES=1 uv run python projects/vision_ssl_transfer/prototypes/mlflow_quickstart_example.py

# Optuna + MLflow nested trial runs (requires optuna from tabular extra)
USE_OPTUNA=1 OPTUNA_TRIALS=20 uv run --extra tabular python projects/vision_ssl_transfer/prototypes/mlflow_quickstart_example.py

# Evidently drift + quality report from ssl_2d_minimal generated samples
uv run --extra monitoring python projects/vision_ssl_transfer/prototypes/evidently_quickstart_example.py

Repository Structure

machine_learning/
├── src/ml_portfolio/        # Shared library code
│   ├── common/              # Config, logging, paths
│   ├── metrics/             # Evaluation metrics by domain
│   ├── eval/                # Slicing, robustness, drift
│   └── tracking/            # MLflow utilities
│
├── projects/                # Individual ML projects
│   ├── _template/           # Copyable project template
│   └── <project>/
│       ├── configs/         # Hydra/YAML configs
│       ├── notebooks/       # Report notebooks only
│       ├── scripts/         # CLI entry points
│       ├── project/         # Project-specific code
│       └── tests/           # Project tests
│
├── infra/                   # Infrastructure (Docker)
│   ├── mlflow/              # MLflow server
│   └── monitoring/          # Evidently for drift
│
├── docs/                    # Documentation
├── data/                    # Data directory (gitignored)
├── artifacts/               # Model artifacts (gitignored)
└── reports/                 # Generated evaluation reports

Why this structure?

Shared library in src/: Common utilities (metrics, tracking, data loading) are reusable across projects. This avoids copy-paste and ensures consistency.

Projects as self-contained units: Each project has its own configs, scripts, and tests. You can understand a project without reading the entire repo.

Notebooks as reports only: Notebooks are for visualization and communication, not for logic. All code lives in importable modules. This makes testing possible and diffs readable.

Consistent CLI per project: Active projects follow a standard interface (download_data.py, train.py, evaluate.py, export.py, serve.py). This reduces cognitive load and enables automation. Planned projects will adopt this structure as they're implemented.
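
As a rough illustration of the automation this interface allows, the sketch below builds the command sequence for one project from the script names listed above. The helper name pipeline_commands and the choice to leave serve.py out of the batch sequence are assumptions made for this example, not part of the repository.

```python
from pathlib import Path

# Batch steps named in this README. serve.py is left out on the assumption
# that it starts a long-running process rather than a one-shot step.
STEPS = ("download_data", "train", "evaluate", "export")


def pipeline_commands(project, steps=STEPS):
    """Build one `uv run python ...` command per step, in order (hypothetical helper)."""
    scripts = Path("projects") / project / "scripts"
    return [
        ["uv", "run", "python", (scripts / f"{step}.py").as_posix()]
        for step in steps
    ]


# Dry run: print the commands instead of executing them.
for command in pipeline_commands("ocr_pipeline"):
    print(" ".join(command))
```

To execute rather than print, each command could be passed to subprocess.run(command, check=True), so that a failing step stops the sequence.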


Evaluation Philosophy

The target evaluation standard for this portfolio is to go beyond aggregate metrics:

Beyond Aggregate Metrics

✗ "Model achieves 95% accuracy"
✓ "Model achieves 95% accuracy overall, but 72% on edge cases involving X"

What this means in practice

Slice-based evaluation: Break down performance by meaningful subgroups (data quality, category frequency, edge cases).
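
A minimal sketch of slice-based evaluation, using only the standard library. The function name accuracy_by_slice, the slice labels, and the toy data are invented for illustration; the shared library in src/ may organize this differently.

```python
from collections import defaultdict


def accuracy_by_slice(y_true, y_pred, slice_labels):
    """Return overall accuracy and a dict of accuracy per slice label."""
    if not (len(y_true) == len(y_pred) == len(slice_labels)):
        raise ValueError("all inputs must have the same length")
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, label in zip(y_true, y_pred, slice_labels):
        totals[label] += 1
        hits[label] += int(truth == pred)
    overall = sum(hits.values()) / sum(totals.values())
    per_slice = {label: hits[label] / totals[label] for label in totals}
    return overall, per_slice


# Toy data (made up): 8 "clean" samples and 2 "blurry" ones.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1]
slices = ["clean"] * 8 + ["blurry"] * 2

overall, per_slice = accuracy_by_slice(y_true, y_pred, slices)
print(f"overall={overall:.2f}", per_slice)
```

In this toy case the aggregate number hides that every error falls in the "blurry" slice, which is the kind of gap the slice breakdown is meant to expose.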

Calibration: A model that says "90% confident" should be right 90% of the time. We measure this with expected calibration error (ECE) and reliability diagrams.
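
One common way to compute ECE is to bin predictions by confidence and take the weighted average gap between accuracy and mean confidence in each bin. The sketch below assumes equal-width bins and a default of 10 bins; the function name and the toy inputs are illustrative, not the repository's implementation.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins: sum of (bin weight) * |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


# Toy inputs (made up): ten predictions at 0.9 confidence.
well_calibrated = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
overconfident = expected_calibration_error([0.9] * 10, [1] * 6 + [0] * 4)
print(f"well calibrated: {well_calibrated:.3f}, overconfident: {overconfident:.3f}")
```

The per-bin accuracy and mean confidence computed inside the loop are also the points a reliability diagram would plot.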

Robustness: How does performance degrade with noise, missing data, or distribution shift?
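
A noise sweep is one simple way to probe this. The sketch below uses a stand-in threshold classifier and Gaussian feature noise; the model, the dataset, and the noise levels are all hypothetical and only show the shape of such a check.

```python
import random


def classify(x):
    """Stand-in model (hypothetical): predicts 1 when the feature exceeds 0.5."""
    return int(x > 0.5)


def accuracy_under_noise(features, labels, noise_std, seed=0):
    """Accuracy after adding zero-mean Gaussian noise to every feature."""
    rng = random.Random(seed)  # fixed seed keeps the sweep reproducible
    noisy = [x + rng.gauss(0.0, noise_std) for x in features]
    hits = sum(classify(x) == y for x, y in zip(noisy, labels))
    return hits / len(labels)


# Toy dataset: 200 evenly spaced features in [0, 1), labelled by the clean model.
features = [i / 200 for i in range(200)]
labels = [classify(x) for x in features]

# One row per perturbation level: (noise_std, accuracy).
rows = [(std, accuracy_under_noise(features, labels, std)) for std in (0.0, 0.05, 0.1, 0.2)]
for std, acc in rows:
    print(f"noise_std={std:<5} accuracy={acc:.3f}")
```

Rows of this form (perturbation level, metric) are what a degradation table would collect; missing-data or distribution-shift perturbations would follow the same pattern with a different transform.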

Decision curves: For classification, accuracy isn't enough. We analyze the tradeoff between false positives and false negatives at different thresholds.
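
The threshold tradeoff can be sketched as a plain sweep that counts false positives and false negatives at each cutoff. This is a simplified illustration with made-up scores, not a full decision-curve analysis; the function name is hypothetical.

```python
def errors_at_threshold(scores, labels, threshold):
    """Return (false positives, false negatives) when predicting 1 for score >= threshold."""
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return false_pos, false_neg


# Toy scores and labels (made up for illustration).
scores = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

sweep = {t: errors_at_threshold(scores, labels, t) for t in (0.3, 0.5, 0.7)}
for threshold, (false_pos, false_neg) in sweep.items():
    print(f"threshold={threshold}: FP={false_pos} FN={false_neg}")
```

Raising the threshold trades false positives for false negatives; which point is acceptable depends on the relative cost of each error in the application.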

Standard Artifacts

Implemented evaluate.py scripts should move toward this artifact convention:

| Artifact | Purpose |
| --- | --- |
| metrics.json | Primary metrics for CI gates |
| slices.csv | Performance by subgroup |
| robustness.csv | Degradation under perturbations |
| plots/ | Visualizations (calibration, confusion, etc.) |
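
A sketch of how an evaluation step might write this layout and how a CI gate could read metrics.json. The file names come from the table above; the column names, the helper names write_artifacts and passes_gate, and all numeric values are placeholders invented for this example.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_artifacts(out_dir, metrics, slices, robustness):
    """Write metrics.json, slices.csv, robustness.csv and an empty plots/ directory."""
    out = Path(out_dir)
    (out / "plots").mkdir(parents=True, exist_ok=True)
    (out / "metrics.json").write_text(json.dumps(metrics, indent=2, sort_keys=True))
    for name, rows in (("slices.csv", slices), ("robustness.csv", robustness)):
        with (out / name).open("w", newline="") as handle:
            # Column order follows the keys of the first row.
            writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)
    return out


def passes_gate(metrics_path, key, minimum):
    """CI-style check: is the named metric at or above the minimum?"""
    metrics = json.loads(Path(metrics_path).read_text())
    return metrics[key] >= minimum


# Placeholder values only; a temporary directory stands in for the real output path.
with tempfile.TemporaryDirectory() as tmp:
    out = write_artifacts(
        tmp,
        metrics={"accuracy": 0.95},
        slices=[{"slice": "clean", "n": 8, "accuracy": 1.0}],
        robustness=[{"noise_std": 0.1, "accuracy": 0.9}],
    )
    written = sorted(p.name for p in out.iterdir())
    gate_ok = passes_gate(out / "metrics.json", "accuracy", 0.9)

print(written, gate_ok)
```

Keeping metrics.json flat and machine-readable is what lets a CI job fail the build on a regression without parsing logs.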

Development

Code Quality

make lint        # Run ruff linter
make format      # Auto-format code
make typecheck   # Run mypy
make check       # All of the above

Testing

make test        # Run all tests
make test-cov    # With coverage report

Pre-commit Hooks

Pre-commit hooks enforce:

  • Code formatting (ruff)
  • Linting (ruff)
  • Type checking (mypy) on src/ and project/ modules
  • Notebook output stripping
  • Large file prevention

Conventions

See docs/repo_conventions.md for detailed guidelines on:

  • Where code should live
  • Notebook policy
  • CLI contracts
  • Naming conventions

See docs/evaluation_standards.md for:

  • Required metrics by problem type
  • Artifact specifications
  • Slice definitions

See docs/mlflow_conventions.md for:

  • Experiment naming
  • Tag schema
  • Artifact organization

See infra/monitoring/evidently/README.md for:

  • Evidently service usage
  • Monitoring use cases

License

This project is licensed under the Apache License 2.0; see the LICENSE file for details.


About

An experimental project for testing ideas and expanding knowledge.
