@zhudezhong

Improve macOS setup & inference robustness (macOS requirements, MPS guidance, optional vLLM backend)

Summary

This PR improves the macOS developer experience and keeps inference from failing when vllm is missing. It adds a macOS-friendly requirements file, documents Apple Silicon mps usage, makes vllm optional in CRAG_Inference.py (with a Transformers fallback), and adds a repo-wide .gitignore.

Key Changes

  • Add .gitignore: Ignore Python caches, virtual envs, build artifacts, uv caches, and common ML outputs.
  • Add requirements-macos.txt: Provide a CPU-friendly dependency set for macOS users.
  • Update README (macOS section):
    • Recommend using requirements-macos.txt on macOS.
    • Document how to select mps (Metal) when available for PyTorch workloads.
    • Explain why macOS does not rely on vLLM by default (experimental support / may require source build), and that inference falls back to Transformers when vllm is unavailable.
  • Make vLLM optional in inference (scripts/CRAG_Inference.py):
    • Remove the hard vllm import.
    • Add --generator_backend {auto|vllm|transformers} (default: auto).
    • Prefer vLLM when available; otherwise fall back to a pure-Transformers generator to avoid ImportError.
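The backend-selection logic described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in scripts/CRAG_Inference.py; the helper name resolve_backend is hypothetical, while the --generator_backend flag and its auto/vllm/transformers choices come from the PR description.

```python
import argparse
import importlib.util


def resolve_backend(requested: str = "auto") -> str:
    """Pick the generation backend without a hard vllm import.

    "auto" prefers vLLM when it is importable and otherwise falls
    back to Transformers, so the script never dies with ImportError.
    """
    vllm_available = importlib.util.find_spec("vllm") is not None
    if requested == "vllm" and not vllm_available:
        raise RuntimeError("--generator_backend vllm requested, but vllm is not installed")
    if requested == "auto":
        return "vllm" if vllm_available else "transformers"
    return requested


parser = argparse.ArgumentParser()
parser.add_argument(
    "--generator_backend",
    choices=["auto", "vllm", "transformers"],
    default="auto",
)
args = parser.parse_args([])  # empty argv for illustration
print(resolve_backend(args.generator_backend))
```

On a macOS machine without vllm installed, the auto path resolves to "transformers", which is exactly the fallback behavior the PR aims for.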

Motivation

  • requirements.txt includes packages that frequently fail to build or install on macOS.
  • CRAG_Inference.py previously required vllm, causing macOS users to hit ImportError after following macOS install guidance.
  • Apple Silicon users can accelerate Torch workloads via mps.
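For the mps point above, device selection for PyTorch on Apple Silicon typically looks like the sketch below. The pick_device helper is a hypothetical name used only for illustration; the availability checks are standard PyTorch API.

```python
import torch


def pick_device() -> str:
    # Prefer Metal (mps) on Apple Silicon, then CUDA, then CPU.
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and torch.backends.mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = torch.device(pick_device())
print(device)
```

Models and tensors are then moved with the usual `.to(device)` calls; no vLLM-specific code is involved on this path.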

How to Test

  • Install (macOS):
    • pip install -r requirements-macos.txt
  • Inference:
    • Default behavior (auto backend) should run without ImportError even if vllm is not installed.
    • Force Transformers backend: add --generator_backend transformers.
    • If vLLM is installed: use --generator_backend vllm.
