feat: add MLX backend for local Apple Silicon RL training #15
alvin-chang wants to merge 8 commits into aiming-lab:main from
Conversation
- `metaclaw/mlx_backend/`: Tinker-compatible adapter package (ServiceClient, SamplingClient, LoraTrainingClient) backed by mlx and mlx-lm
- `metaclaw/sdk_backend.py`: auto-detection falls through to MLX when no cloud credentials are set
- `metaclaw/api_server.py`: guard `run_llm` when no `llm_api_key` is configured; skip Tinker-specific `sample_async` kwargs for MLX
- `metaclaw/setup_wizard.py`: add 'mlx' backend choice, skip API key prompts, validate the mlx/mlx-lm install, default to an HF MLX model ID
- `tests/`: end-to-end smoke test + unit tests (16/16 passing)
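The auto-detection fallthrough described for `sdk_backend.py` can be sketched roughly like this (a minimal illustration; apart from `METACLAW_RL_BACKEND`, which appears later in this thread, the variable and key names are assumptions, not the project's actual ones):

```python
import os

def detect_backend(env=None):
    """Pick an RL backend: prefer an explicit override, then cloud
    credentials, then fall through to the local MLX backend."""
    env = os.environ if env is None else env
    # Explicit override via METACLAW_RL_BACKEND (mentioned in this thread).
    if env.get("METACLAW_RL_BACKEND"):
        return env["METACLAW_RL_BACKEND"]
    # Hypothetical cloud-credential check; the real key name may differ.
    if env.get("TINKER_API_KEY"):
        return "tinker"
    # No cloud credentials set: default to local MLX.
    return "mlx"
```

With no relevant environment set, `detect_backend({})` returns `"mlx"`, which is the fallthrough behavior the PR describes.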
Hi @alvin-chang, thank you for contributing; the code looks generally good. Only minor changes are needed to match the style of the codebase :)
Hi @alvin-chang, friendly ping here to make sure that you are still working on this PR |
Let me remove those and resubmit the PR. |
Hi @alvin-chang, are you still working on this PR? I saw there's one last comment remaining unresolved. Feel free to ping me once you've finished all the changes.
I thought I resolved all of them. Is it the METACLAW_RL_BACKEND? |
|
ah it's the INTEGRATION_NOTES.md. Could you integrate it into a small section in the README?
…into mlx-backend
- Replace reference to INTEGRATION_NOTES.md with detailed MLX backend integration information directly in README.md
- Add comprehensive MLX backend details including file structure, configuration changes, and setup requirements
- Create CLAUDE.md with project guidance for Claude Code instances

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update MLX backend section to include usage instructions
- Add installation and configuration details for MLX support
- Document how MLX support integrates alongside other backends (Tinker, MinT)
- Include configuration options and environment variable usage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
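As a hedged sketch of the environment-variable configuration mentioned above (only `METACLAW_RL_BACKEND` and `MLX_TEST_MODEL` appear in this thread; treat the exact semantics as illustrative, not the documented contract):

```shell
# Force the MLX backend instead of relying on auto-detection.
export METACLAW_RL_BACKEND=mlx

# Point the smoke test at a quantized MLX model on the Hugging Face Hub.
export MLX_TEST_MODEL=mlx-community/Qwen2.5-1.5B-Instruct-4bit

# Run the end-to-end smoke test with that configuration.
python3 tests/smoke_mlx_proxy.py
```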
Hi @alvin-chang, thanks again for contributing. The commit overall looks good; some minor typo/doc fixes are still needed, see my comments above. Ping me once you finish them, thanks!
"Adds a local RL training backend using Apple MLX, so MetaClaw can run
the full training loop (inference → GRPO → weight update → hot-swap) on
Apple Silicon without any cloud API.
Changes (13 files, +1805 -39)
Testing
```bash
MLX_TEST_MODEL=mlx-community/Qwen2.5-1.5B-Instruct-4bit python3 tests/smoke_mlx_proxy.py
```
16/16 passing: auto-detection → model load → proxy startup → chat completions
with logprobs → sample collection → forward/backward + optim → hot-swap → post-swap inference."