feat: add MLX backend for local Apple Silicon RL training #15
alvin-chang wants to merge 8 commits into aiming-lab:main from
Conversation
- `metaclaw/mlx_backend/`: Tinker-compatible adapter package (ServiceClient, SamplingClient, LoraTrainingClient) backed by mlx and mlx-lm
- `metaclaw/sdk_backend.py`: auto-detection falls through to MLX when no cloud credentials are set
- `metaclaw/api_server.py`: guard `run_llm` when no `llm_api_key` is configured; skip Tinker-specific `sample_async` kwargs for MLX
- `metaclaw/setup_wizard.py`: add 'mlx' backend choice, skip API key prompts, validate the mlx/mlx-lm install, default to an HF MLX model ID
- `tests/`: end-to-end smoke test + unit tests (16/16 passing)
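The auto-detection fallthrough described for `sdk_backend.py` can be sketched roughly like this (a minimal illustration; apart from `METACLAW_RL_BACKEND`, which appears later in this thread, the variable and key names are assumptions, not the project's actual ones):

```python
import os

def detect_backend(env=None):
    """Pick an RL backend: prefer an explicit override, then cloud
    credentials, then fall through to the local MLX backend."""
    env = os.environ if env is None else env
    # Explicit override via METACLAW_RL_BACKEND (mentioned in this thread).
    if env.get("METACLAW_RL_BACKEND"):
        return env["METACLAW_RL_BACKEND"]
    # Hypothetical cloud-credential check; the real key name may differ.
    if env.get("TINKER_API_KEY"):
        return "tinker"
    # No cloud credentials set: default to local MLX.
    return "mlx"
```

With no relevant environment set, `detect_backend({})` returns `"mlx"`, which is the fallthrough behavior the PR describes.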
Hi @alvin-chang, thank you for contributing; the code looks generally good. Only minor changes are needed to match the style of the codebase :)
Hi @alvin-chang, friendly ping here to make sure that you are still working on this PR |
Let me remove those and resubmit the PR. |
Hi @alvin-chang, are you still working on this PR? I saw there's one last comment remaining unresolved. Feel free to ping me once you've finished all the changes.
I thought I resolved all of them. Is it the METACLAW_RL_BACKEND? |
|
ah it's the INTEGRATION_NOTES.md. Could you integrate it into a small section in the README?
…into mlx-backend
- Replace reference to INTEGRATION_NOTES.md with detailed MLX backend integration information directly in README.md
- Add comprehensive MLX backend details including file structure, configuration changes, and setup requirements
- Create CLAUDE.md with project guidance for Claude Code instances

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update MLX backend section to include usage instructions
- Add installation and configuration details for MLX support
- Document how MLX support integrates alongside other backends (Tinker, MinT)
- Include configuration options and environment variable usage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
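As a hedged sketch of the environment-variable configuration mentioned above (only `METACLAW_RL_BACKEND` and `MLX_TEST_MODEL` appear in this thread; treat the exact semantics as illustrative, not the documented contract):

```shell
# Force the MLX backend instead of relying on auto-detection.
export METACLAW_RL_BACKEND=mlx

# Point the smoke test at a quantized MLX model on the Hugging Face Hub.
export MLX_TEST_MODEL=mlx-community/Qwen2.5-1.5B-Instruct-4bit

# Run the end-to-end smoke test with that configuration.
python3 tests/smoke_mlx_proxy.py
```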
Hi @alvin-chang, thanks again for contributing. The commit overall looks good; some minor typo/doc fixes are still needed, see my comments above. Ping me once you finish them, thanks!
"Adds a local RL training backend using Apple MLX, so MetaClaw can run
the full training loop (inference → GRPO → weight update → hot-swap) on
Apple Silicon without any cloud API.
Changes (13 files, +1805 -39)
Testing
```bash
MLX_TEST_MODEL=mlx-community/Qwen2.5-1.5B-Instruct-4bit python3 tests/smoke_mlx_proxy.py
```
16/16 passing: auto-detection → model load → proxy startup → chat completions
with logprobs → sample collection → forward/backward + optim → hot-swap → post-swap inference."