feat(vllm): vllm gb200 dsv4 recipes #103

Draft
alec-flowers wants to merge 51 commits into main from aflowers/vllm-gb200-v0.20.0

alec-flowers (Collaborator) commented Apr 28, 2026

Draft PR for the vLLM GB200 v0.20.0 branch.

Summary:

  • Adds a self-contained lm-eval benchmark runner for GSM8K-style evals against an OpenAI-compatible chat endpoint.
  • Keeps the existing SGLang-dependent gsm8k runner untouched. The new path uses python3 -m lm_eval --model local-chat-completions and requires neither SGLang nor an InferenceX workspace mount.
  • Bundles the GSM8K task YAML, score thresholds, and score validator under src/srtctl/benchmarks/scripts/lm-eval/.
  • Updates only recipes/vllm/deepseek-v4-pro/GB200/8k1k/disagg-gb200-1p4d-dep8-tp8-c256-c512-offload.yaml to run lm-eval with the bundled GSM8K task and VALIDATE_EVAL_SCORES=true.
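For readers unfamiliar with threshold-based score validation, here is a minimal sketch of the idea behind the bundled validator. It is not the code in validate_scores.py. The function name, the threshold layout, and the example numbers are hypothetical, and the results layout (a top-level "results" mapping keyed by task, then by metric) is assumed from lm-eval's usual JSON output.

```python
def validate_scores(results: dict, thresholds: dict) -> list[str]:
    """Return a list of failure messages; an empty list means every threshold was met.

    Assumed (hypothetical) shapes:
      results    = {"results": {task: {metric: score}}}
      thresholds = {task: {metric: minimum_score}}
    """
    failures = []
    for task, metrics in thresholds.items():
        task_results = results.get("results", {}).get(task)
        if task_results is None:
            failures.append(f"{task}: no results found")
            continue
        for metric, minimum in metrics.items():
            score = task_results.get(metric)
            if score is None:
                failures.append(f"{task}/{metric}: metric missing")
            elif score < minimum:
                failures.append(f"{task}/{metric}: {score:.4f} < {minimum:.4f}")
    return failures


# Illustrative values only; these are not the thresholds shipped in this PR.
example_results = {"results": {"gsm8k": {"exact_match,strict-match": 0.91}}}
example_thresholds = {"gsm8k": {"exact_match,strict-match": 0.85}}
print(validate_scores(example_results, example_thresholds))
```

A caller gated on VALIDATE_EVAL_SCORES=true could exit non-zero whenever the returned list is non-empty. That wiring is an assumption here, not a description of the bundled script.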

Validation:

  • Local smoke: launched a small Qwen/Qwen2.5-0.5B-Instruct OpenAI-compatible chat endpoint and ran the bundled script with EVAL_LIMIT=2, EVAL_NUM_FEWSHOT=0, EVAL_CONC=1; it produced meta_env.json, results_*.json, and samples_*.jsonl successfully.
  • bash -n src/srtctl/benchmarks/scripts/lm-eval/bench.sh passed.
  • python3 -m py_compile src/srtctl/benchmarks/lm_eval.py src/srtctl/cli/do_sweep.py src/srtctl/benchmarks/scripts/lm-eval/validate_scores.py passed.
  • Focused tests passed: tests/test_benchmarks.py::TestLMEvalRunner, TestRunPostEval, and TestScriptsExist.
  • UV_DEFAULT_INDEX=https://pypi.org/simple make check passed (635 passed, 2 skipped, 6 deselected).
  • The known ty diagnostic in src/srtctl/core/validation.py is still emitted, but the existing || true in the Makefile keeps it from failing the check.
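To make the local smoke run above easier to picture, here is a rough sketch of how EVAL_LIMIT, EVAL_NUM_FEWSHOT, and EVAL_CONC might map onto an lm_eval command line. The mapping is an assumption, not a transcript of bench.sh. The helper name, the endpoint URL, and the output directory are hypothetical. The flags themselves (--model_args, --tasks, --num_fewshot, --limit, --apply_chat_template, --output_path, --log_samples) are standard lm_eval CLI options.

```python
import shlex


def build_lm_eval_command(env, base_url, model, task="gsm8k", output_dir="eval_out"):
    """Build (but do not run) an lm_eval invocation from EVAL_* settings.

    The variable-to-flag mapping is an assumption made for illustration.
    """
    concurrency = env.get("EVAL_CONC", "1")
    model_args = f"model={model},base_url={base_url},num_concurrent={concurrency}"
    cmd = [
        "python3", "-m", "lm_eval",
        "--model", "local-chat-completions",
        "--model_args", model_args,
        "--tasks", task,
        "--apply_chat_template",      # chat endpoints expect chat-formatted prompts
        "--output_path", output_dir,  # where results_*.json is written
        "--log_samples",              # also write samples_*.jsonl
    ]
    if env.get("EVAL_NUM_FEWSHOT") is not None:
        cmd += ["--num_fewshot", env["EVAL_NUM_FEWSHOT"]]
    if env.get("EVAL_LIMIT"):
        cmd += ["--limit", env["EVAL_LIMIT"]]
    return cmd


# Hypothetical endpoint; the settings mirror the smoke test described above.
smoke_env = {"EVAL_LIMIT": "2", "EVAL_NUM_FEWSHOT": "0", "EVAL_CONC": "1"}
command = build_lm_eval_command(
    smoke_env,
    base_url="http://localhost:8000/v1/chat/completions",
    model="Qwen/Qwen2.5-0.5B-Instruct",
)
print(shlex.join(command))
```

The list form can be passed to subprocess.run as-is, which avoids shell quoting issues in the comma-separated --model_args value.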

alec-flowers changed the title from "chore: track vLLM GB200 v0.20.0 baseline" to "feat(vllm): run DSv4 1p4d GSM8K via lm-eval" on Apr 28, 2026
alec-flowers force-pushed the aflowers/vllm-gb200-v0.20.0 branch from a48fe36 to 4acf4ea on April 28, 2026 03:39
alec-flowers changed the title from "feat(vllm): run DSv4 1p4d GSM8K via lm-eval" to "feat(vllm): vllm gb200 dsv4 recipes" on Apr 28, 2026
alec-flowers force-pushed the aflowers/vllm-gb200-v0.20.0 branch from 707e933 to c6df50a on April 28, 2026 03:55
alec-flowers force-pushed the aflowers/vllm-gb200-v0.20.0 branch from 406e5b4 to 7beaa58 on April 29, 2026 01:11
alec-flowers force-pushed the aflowers/vllm-gb200-v0.20.0 branch from 7beaa58 to 653a652 on April 29, 2026 01:32