feat(wan22): add WAN 2.2 text-to-video adapter and dataset for MLPerf inference #293
Open
wu6u3tw wants to merge 22 commits into mlcommons:main from wu6u3tw:feat/wan22
22 commits (all by wu6u3tw)
8129aee chore: ignore .worktrees/ directory
9ded0e6 feat(wan22): add WAN 2.2 text-to-video adapter, dataset, wire types, …
ddac990 refactor: rename wan22 module to videogen, Wan22Adapter→VideoGenAdapt…
fe068e4 fix: make response_format optional in VideoGenAdapter (default video_…
a9d1703 chore(wan22): add MLPerf WAN2.2 prompt dataset (248 samples)
cb1fe67 chore: apply post-rebase pre-commit autofixes
62b8dfb fix(videogen): negative_prompt None default + exclude_none, add laten…
8061668 remove path in offline_wan22_lyris.yaml
32d7943 Remove name of the datacenter setup_and_test.sh
e345c5b Update and rename offline_wan22_lyris.yaml to offline_wan22.yaml
e0ae775 fix(videogen): default response_format to video_path; align docs
8295afc chore(wan22): bundle prompts dataset under example folder; fix stale …
9b6b70a refactor(videogen): drop VideoGenDataset, ingest JSONL via generic lo…
837397d fix(wan22): remove invalid metrics block from offline_wan22.yaml
763f65a refactor(videogen): review polish — revert latent factory regression,…
e960fa0 docs(videogen): align references with current module layout
7f645e3 refactor(videogen): drop hardcoded WAN 2.2 strings, trim example dataset
0a7d314 test(videogen): tmp_path mock video, drop unused videogen extra and s…
d0b9a00 refactor(testing): expose route hook on EchoServer; videogen mocks re…
7b6f128 fix(videogen): tighten request/response wiring and fail loud on strea…
8ce9006 refactor(videogen): video_id belongs in metadata, not response_output
63e259a fix(probe): reject api_type=videogen with a clear error
File: offline_wan22.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| # Offline Video Generation Benchmark for WAN 2.2 (GB200/GB300) | ||
| # | ||
| # Targets trtllm-serve POST /v1/videos/generations directly (no proxy). | ||
| # Uses response_format=video_path: server saves video to Lustre and returns | ||
| # the file path, avoiding large video byte payloads over HTTP/ZMQ. | ||
| # | ||
| # MLPerf inference parameters (text_to_video task): | ||
| # Resolution: 720x1280 (portrait) | ||
| # Duration: 81 frames ≈ 5 s at 16 fps | ||
| # Steps: 20 denoising steps | ||
| # Guidance: 4.0 (primary CFG) / 3.0 (null-text secondary) | ||
| # Seed: 42 (fixed for reproducibility; combine with fixed_latent.pt) | ||
| # Dataset: 248 prompts from shopify_product_catalogue::q3vl | ||
| # | ||
| # Resolution / duration / steps / guidance / seed are defaulted on | ||
| # `VideoPathRequest`. Each JSONL row carries `prompt` plus the canonical | ||
| # MLPerf `negative_prompt`; both flow into `query.data` and serialise into | ||
| # the request body, while unset fields fall back to the request defaults. | ||
| | ||
| name: "offline-wan22-video-generation-benchmark" | ||
| version: "1.0" | ||
| type: "offline" | ||
| | ||
| model_params: | ||
| name: "wan22" | ||
| max_new_tokens: 1 # Ignored by VideoGenAdapter; kept >0 so swapping api_type to openai/sglang for debugging doesn't yield a 400. | ||
| streaming: "off" # WAN 2.2 uses non-streaming HTTP POST/response | ||
| | ||
| datasets: | ||
| - name: wan22_prompts | ||
| path: examples/09_Wan22_VideoGen_Example/wan22_prompts.jsonl | ||
| type: "performance" | ||
| samples: 248 | ||
| | ||
| settings: | ||
| runtime: | ||
| max_duration_ms: 600000 # 10 minute cap | ||
| scheduler_random_seed: 42 | ||
| dataloader_random_seed: 42 | ||
| n_samples_to_issue: 248 | ||
| | ||
| load_pattern: | ||
| type: "max_throughput" | ||
| | ||
| client: | ||
| num_workers: 4 | ||
| | ||
| endpoint_config: | ||
| endpoints: | ||
| - "http://localhost:8000" | ||
| api_type: "videogen" | ||
| api_key: null | ||
| | ||
| report_dir: logs/wan22_video_generation_benchmark | ||
wu6u3tw marked this conversation as resolved.
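The header comment in offline_wan22.yaml says that each JSONL row supplies `prompt` and `negative_prompt`, while resolution, frame count, steps, guidance, and seed fall back to defaults on `VideoPathRequest`. The diff does not show that class, so the following is only a minimal sketch of the described merge. The class name `VideoPathRequestSketch`, every field name, and the helper `build_request_body` are assumptions made for illustration; the default values are taken from the YAML comment.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class VideoPathRequestSketch:
    """Hypothetical stand-in for the PR's VideoPathRequest (field names assumed)."""

    prompt: str
    negative_prompt: Optional[str] = None  # omitted from the body when unset
    size: str = "720x1280"                 # resolution from the YAML comment
    num_frames: int = 81
    num_inference_steps: int = 20
    guidance_scale: float = 4.0
    guidance_scale_2: float = 3.0
    seed: int = 42
    response_format: str = "video_path"


def build_request_body(jsonl_line: str) -> dict:
    """Merge one dataset row with the defaults and drop fields that are None."""
    row = json.loads(jsonl_line)
    known = {f.name for f in fields(VideoPathRequestSketch)}
    # Ignore columns the request does not know about (an assumption of this sketch).
    request = VideoPathRequestSketch(**{k: v for k, v in row.items() if k in known})
    return {k: v for k, v in asdict(request).items() if v is not None}
```

Dropping `None` values mirrors the `exclude_none` behaviour named in commit 62b8dfb: a row without `negative_prompt` produces a body with no such key, so the server applies its own default.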
File: setup_and_test.sh
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| #!/usr/bin/env bash | ||
| # setup_and_test.sh — End-to-end runbook for the WAN 2.2 video-generation example. | ||
| # | ||
| # Steps: | ||
| # 1. Download the WAN 2.2 weights from HuggingFace. | ||
| # 2. Launch trtllm-serve in a separate shell. | ||
| # 3. Run the offline benchmark from this script. | ||
| # | ||
| # Prerequisites: Python 3.12, a GPU host with trtllm-serve installed, | ||
| # and HuggingFace credentials (`huggingface-cli login`) — the WAN 2.2 | ||
| # weights are gated. | ||
| | ||
| set -euo pipefail | ||
| | ||
| SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" | ||
| REPO_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)" | ||
| | ||
| MODEL_REPO="Wan-AI/Wan2.2-T2V-A14B" # https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B | ||
| MODEL_DIR="${MODEL_DIR:-${HOME}/models/wan2.2-t2v-a14b}" | ||
| | ||
| cd "${REPO_ROOT}" | ||
| | ||
| # 1. Download model weights (~28 GB). | ||
| huggingface-cli download "${MODEL_REPO}" --local-dir "${MODEL_DIR}" | ||
| | ||
| # 2. Launch the server in a separate shell, then re-run this script: | ||
| # | ||
| # trtllm-serve "${MODEL_DIR}" --host 0.0.0.0 --port 8000 \ | ||
| # --backend pytorch --task text_to_video | ||
| # | ||
| # 3. Run the offline benchmark. | ||
| inference-endpoint benchmark from-config \ | ||
| --config "${SCRIPT_DIR}/offline_wan22.yaml" |
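Step 3 of setup_and_test.sh assumes the server from step 2 is already listening on port 8000; with `set -euo pipefail`, the benchmark command would otherwise fail on its first request. A readiness check before step 3 could look like the sketch below. It is not part of the PR: `wait_for_port` is a hypothetical helper, and a successful TCP connect only shows that the port accepts connections, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0,
                  interval_s: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused) on failure.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For the configuration in this PR, the call would be `wait_for_port("localhost", 8000, timeout_s=600)`, with a non-zero exit when it returns `False`.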