Skip to content

Latest commit

 

History

History
executable file
·
47 lines (40 loc) · 3.06 KB

File metadata and controls

executable file
·
47 lines (40 loc) · 3.06 KB

Project Review: Rust vs Python ML Benchmark System

Scope and Objectives

  • Comprehensive cross-language benchmarking across four domains: Classical ML, Deep Learning (CNN/RNN), Reinforcement Learning, and LLMs.
  • Orchestrated with Nextflow, producing structured JSON metrics and analysis artifacts.
  • Dual implementations (Python/Rust) emphasizing performance, resource usage, and result quality.

Current Build and Smoke Status

  • Rust workspace compiles clean after dependency alignment:
    • Upgraded tch to 0.14 in cnn_benchmark, rnn_benchmark, dqn_benchmark, and policy_gradient_benchmark to ensure C++17 compatibility on macOS 14.
    • Classical ML crates (regression_benchmark, svm_benchmark, clustering_benchmark) compile without errors using linfa.
  • Smoke workflow:
    • CNN, LLM, RL, RNN: green
    • Python Classical ML: requires local venv and pip install -r requirements.txt, then nextflow run workflows/smoke.nf -resume.

Implementation Quality Assessment

  • Classical ML (Rust/Python):
    • Uses linfa (Rust) and scikit-learn (Python) with standard algorithms and robust metrics (RMSE/MAE/R² for regression; Accuracy/F1/ROC where applicable for classification; clustering indices).
    • Deterministic seeds and consistent data splits improve reproducibility.
  • Deep Learning (Rust/Python):
    • CNN/RNN built on tch (Rust) and PyTorch (Python); supports common architectures and captures timing/resource metrics.
    • Upgraded tch mitigates C++ standard mismatches on macOS.
  • Reinforcement Learning (Rust/Python):
    • Rust RL crates build clean with tch; Python uses stable-baselines3.
    • Consistent metrics surface (training time, mean reward, inference speed) across languages.
  • LLMs:
    • Smoke validated on Python LLM tasks; Rust LLM crates exist in repo but are not part of the current workspace members.

Reproducibility and Usage

  • Python setup (recommended):
    • python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
  • Rust build (entire workspace):
    • cargo clean && cargo build --workspace --release
  • Smoke workflow:
    • nextflow run workflows/smoke.nf
    • If Python Classical ML tasks fail due to deps: re-run with -resume after installing requirements.

Known Limitations and Improvements

  • Add pytest-based smoke tests that execute a minimal run per category and validate JSON schema outputs.
  • Tighten Rust metric calculations where simplified (e.g., some percentiles/aggregation) and add unit tests for metric helpers.
  • Provide a small Nextflow profile for local E2E runs and ensure container/local profiles are clearly documented.
  • Consider promoting Rust LLM crates into the workspace once they reach the same fidelity as other domains.

Conclusion

  • The project is in strong shape: Rust workspace builds clean, and the smoke workflow is green across CNN, LLM, RL, and RNN, with a clear remediation path for Python Classical ML via venv + requirements installation.
  • Documentation is aligned (USERGUIDE.md, README.md, SPECS.md) and contains actionable guidance for users to reproduce results.