feat: use pre-generated custom dataset for benchmarking MTP with chat template#63

Open
richardhuo-nv wants to merge 1 commit into sa-submission-q2-2026 from rihuo/add_custom_dataset_for_sa
Conversation


@richardhuo-nv richardhuo-nv commented Apr 23, 2026

This is a workaround to add the chat template for MTP-based benchmarking of the GLM-5 model.

GLM-5 uses a custom tokenizer, and invoking it multiple times causes a PMIx failure at runtime.

The workaround is to pre-generate the dataset with the chat template already applied, then use that dataset for MTP-based benchmarking.

The config looks like this:

```yaml
benchmark:
  type: "sa-bench"
  isl: 1024
  osl: 1024
  concurrencies: "8192"
  req_rate: "inf"
  dataset_name: "custom"
  dataset_path: "/glm5_datasets/glm5-1024-1024-100000-ratio-1_for_serve.json"
  custom_tokenizer: "glm_moe_dsa"

extra_mount:
  - "/lustre/fsw/core_dlfw_ci/rihuo/glm5_dataset:/glm5_datasets"
```
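Pre-generating the dataset amounts to rendering the chat template once per prompt, offline, and writing the results to the JSON file that `dataset_path` points at, so the benchmark never has to invoke the tokenizer's template logic at runtime. A minimal sketch of that offline step (the template string, record layout, and `render_dataset` helper are illustrative assumptions, not the actual GLM-5 tokenizer format):

```python
import json

# Illustrative chat template; the real one lives in GLM-5's custom tokenizer.
CHAT_TEMPLATE = "<|user|>\n{prompt}\n<|assistant|>\n"

def render_dataset(prompts, out_path, osl=1024):
    """Apply the chat template once per prompt and dump a JSON dataset.

    The benchmark can then read the pre-rendered prompts verbatim instead
    of calling the custom tokenizer's template logic per request.
    """
    records = [
        {"prompt": CHAT_TEMPLATE.format(prompt=p), "output_len": osl}
        for p in prompts
    ]
    with open(out_path, "w") as f:
        json.dump(records, f)
    return records
```

The rendered file is then mounted into the container (via `extra_mount` above) and selected with `dataset_name: "custom"`.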
