Conversation
Signed-off-by: seungrokj <seungrok.jung@amd.com>
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you
Claude finished @seungrokj's task in 1m 22s. PR #910 Review
1 blocking issue found. 🔴 The benchmark script changes look good: server launch command is properly formatted with arguments on separate lines, optimization flags (
LGTM — straightforward Docker image bump and benchmark server tuning flags.
Overview
This PR touches two files: a Docker image version bump in .github/configs/amd-master.yaml (from v0.5.8.post1 to v0.5.9) and optimization flags added to the benchmark script benchmarks/single_node/qwen3.5_fp8_mi355x.sh. The script changes include switching from python3 -m sglang.launch_server to the newer sglang serve CLI, making MEM_FRAC_STATIC configurable via environment variable, and adding several performance-tuning flags (fp8 KV cache, chunked prefill, cuda graph bounds, radix cache disable, etc.).
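As a rough illustration of the pattern described above (an environment-overridable MEM_FRAC_STATIC feeding an `sglang serve` launch with one argument per line), a minimal sketch might look like the following. The default value 0.8, the MODEL_PATH placeholder, the chunked-prefill size, and the exact flag spellings are assumptions for illustration, based on the long-standing `sglang.launch_server` options; they are not copied from the PR and should be checked against the actual script.

```shell
# Sketch only: values and flag spellings below are assumptions, not taken from the PR.

# Use the caller's MEM_FRAC_STATIC if it is set; otherwise fall back to a placeholder default.
MEM_FRAC_STATIC="${MEM_FRAC_STATIC:-0.8}"
# Hypothetical placeholder; the real script would point at the model it benchmarks.
MODEL_PATH="${MODEL_PATH:-/path/to/model}"

# Assemble the launch command one argument per line so flag changes produce small diffs.
set -- \
  sglang serve \
  --model-path "$MODEL_PATH" \
  --mem-fraction-static "$MEM_FRAC_STATIC" \
  --kv-cache-dtype fp8_e4m3 \
  --chunked-prefill-size 32768 \
  --disable-radix-cache

# Print the command instead of running it, so the sketch is safe to execute anywhere.
SERVER_CMD="$*"
echo "$SERVER_CMD"
```

Running it as `MEM_FRAC_STATIC=0.9 sh sketch.sh` (the file name is hypothetical) would print the same command with the overridden memory fraction, which is the point of making the value configurable per sweep.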
Security risks
None. These are benchmark configuration files and a shell script that runs in a controlled CI/benchmarking environment. No auth, crypto, permissions, or user-facing code is affected.
Level of scrutiny
Low scrutiny is appropriate. This is a config and benchmark tuning change following established patterns already present in the repo for other model configurations. The author has triggered a test run via /test-config.
Other factors
No bugs were found by the automated bug hunting system. No outstanding reviewer comments exist. The changes are self-contained and follow the same patterns as other benchmark scripts in the repository.
@functionstackx can you please run this sweep? (I tried to run it manually, but it doesn't seem to work.)
/sweep test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys qwen3.5-fp8-mi355x-sglang
@seungrokj Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23223448478
@cquil11 or @Oseltamivir can u help for day to day tasks. plz ping them |
Waiting for the optimized upstream Docker image.
Regards,
Seungrok