
feat: add MLX backend for local Apple Silicon RL training#15

Open
alvin-chang wants to merge 8 commits into aiming-lab:main from alvin-chang:mlx-backend

Conversation

@alvin-chang

Adds a local RL training backend using Apple MLX, so MetaClaw can run
the full training loop (inference → GRPO → weight update → hot-swap) on
Apple Silicon without any cloud API.

Changes (13 files, +1805 -39)

  • metaclaw/mlx_backend/ — Tinker-compatible adapter package (ServiceClient, SamplingClient, LoraTrainingClient) backed by mlx and mlx-lm
  • metaclaw/sdk_backend.py — auto-detection falls through to MLX when no cloud credentials are set
  • metaclaw/api_server.py — guard run_llm when no llm_api_key is configured; skip Tinker-specific sample_async kwargs for MLX
  • metaclaw/setup_wizard.py — add an mlx backend choice, skip API key prompts, validate the mlx/mlx-lm install, default to an HF MLX model ID
  • tests/ — end-to-end smoke test (16/16 passing), unit tests, integration tests
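The auto-detection fallback in sdk_backend.py could look roughly like this. This is a minimal sketch, not the PR's actual code: the function name and the TINKER_API_KEY variable are assumptions, while METACLAW_RL_BACKEND is an override mentioned later in this thread.

```python
import importlib.util
import os


def detect_rl_backend() -> str:
    """Pick an RL backend: explicit override, then cloud credentials,
    then local MLX. Hypothetical sketch of the fallback order the PR
    describes; names are illustrative."""
    override = os.environ.get("METACLAW_RL_BACKEND")
    if override:
        return override
    # Cloud credential check (variable name is an assumption).
    if os.environ.get("TINKER_API_KEY"):
        return "tinker"
    # No cloud credentials: fall through to MLX if the packages exist.
    if importlib.util.find_spec("mlx") and importlib.util.find_spec("mlx_lm"):
        return "mlx"
    raise RuntimeError(
        "No RL backend available: set METACLAW_RL_BACKEND or install mlx/mlx-lm"
    )
```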

Testing

```bash
MLX_TEST_MODEL=mlx-community/Qwen2.5-1.5B-Instruct-4bit python3 tests/smoke_mlx_proxy.py
```

16/16 passing: auto-detection → model load → proxy startup → chat completions
with logprobs → sample collection → forward/backward + optimizer step → hot-swap → post-swap inference.
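The loop order above (inference → GRPO → weight update → hot-swap) hinges on GRPO's group-relative advantage step: several completions are sampled per prompt and each reward is normalized against its own group. A minimal, framework-agnostic sketch (function and variable names are illustrative, not from this PR):

```python
from statistics import mean, pstdev


def grpo_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Group-relative advantages as used by GRPO: normalize each sampled
    completion's reward against the mean and population std of its group
    (the set of completions for the same prompt)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group of three rewards for a single prompt; advantages sum to zero.
adv = grpo_advantages([1.0, 2.0, 3.0])
```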

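The hot-swap step the smoke test exercises (weight update → hot-swap → post-swap inference) can be illustrated with an atomic model-reference swap. This is purely illustrative, under the assumption that the server holds a swappable reference; MetaClaw's actual mechanism is not shown in this thread.

```python
import threading


class ModelRegistry:
    """Serve inference from a swappable model reference: requests that
    already grabbed the old reference finish on the old weights, while
    new requests see the updated ones."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._model

    def hot_swap(self, new_model):
        """Atomically replace the served model; returns the old one."""
        with self._lock:
            old, self._model = self._model, new_model
        return old
```

In a training loop, hot_swap would be called after each optimizer step so the sampling side immediately serves the updated LoRA weights.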
@huaxiuyao huaxiuyao requested a review from ImKeTT March 15, 2026 22:24
@ImKeTT
Collaborator

ImKeTT commented Mar 16, 2026

Hi @alvin-chang, thank you for contributing. The code looks generally good; only minor changes are needed to match the style of the codebase :)

@ImKeTT
Collaborator

ImKeTT commented Mar 18, 2026

Hi @alvin-chang, friendly ping here to make sure that you are still working on this PR

@alvin-chang
Author

Let me remove those and resubmit the PR.

@ImKeTT
Collaborator

ImKeTT commented Mar 20, 2026

Hi @alvin-chang, are you still working on this PR? I saw there's one last comment remaining unresolved. Feel free to ping me once you've finished all the changes.

@alvin-chang
Author

I thought I resolved all of them. Is it the METACLAW_RL_BACKEND?

@ImKeTT
Collaborator

ImKeTT commented Mar 20, 2026

Ah, it's INTEGRATION_NOTES.md. Could you integrate it into a small section in the README?

Alvin Chang and others added 5 commits March 23, 2026 19:10
- Replace reference to INTEGRATION_NOTES.md with detailed MLX backend integration information directly in README.md
- Add comprehensive MLX backend details including file structure, configuration changes, and setup requirements
- Create CLAUDE.md with project guidance for Claude Code instances

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update MLX backend section to include usage instructions
- Add installation and configuration details for MLX support
- Document how MLX support integrates alongside other backends (Tinker, MinT)
- Include configuration options and environment variable usage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ImKeTT
Collaborator

ImKeTT commented Mar 24, 2026

Hi @alvin-chang, thanks again for contributing. The commit overall looks good; some minor typo/doc fixes are still needed, see my comments above. Ping me once you finish them, thanks!
