Add SGLang Cosmos3 serving docs by mickqian · Pull Request #174 · NVIDIA/cosmos

mickqian · 2026-06-01T09:54:39Z

Summary

- Add a "Generator with SGLang" section next to the existing Diffusers and vLLM-Omni serving paths
- Document SGLang serve commands for Cosmos3-Nano and Cosmos3-Super
- List the current SGLang-supported Cosmos3 visual generation checkpoints and note follow-up gaps for sound, V2V, and action modes
- Include OpenAI-compatible T2I and T2V request examples

Notes

This section is intended to mirror the vLLM-Omni style while keeping the current SGLang support boundary explicit.

lfengad · 2026-06-02T07:34:43Z

@yogeshbalaji Is this in the plan for the landing page?

nv-dmajchrowski · 2026-06-02T12:25:01Z

+```shell
+sglang serve \
+  --model-path nvidia/Cosmos3-Super-Image2Video \
+  --num-gpus 4


If I'm not mistaken,

sglang serve \
  --model-path nvidia/Cosmos3-Super-Image2Video \
  --num-gpus 4

is equivalent to CFG parallelism plus a Ulysses degree of 2, i.e.

sglang serve \
  --model-path nvidia/Cosmos3-Super-Image2Video \
  --num-gpus 4 --enable-cfg-parallel --ulysses-degree 2

which is indeed the preferred way to serve multi-GPU inference, but only if the model fits on a single GPU (>80GB). It is only the best setup for performance; it doesn't reduce memory requirements.

A safer option would be to use FSDP as the example for the Cosmos3-Super checkpoint, as this setup actually does reduce the memory requirement by sharding the weights across GPUs, i.e.:

sglang serve \
  --model-path nvidia/Cosmos3-Super-Image2Video \
  --num-gpus 4 --use-fsdp-inference

If we are looking for memory-friendly setups, then yes, we could do better; either FSDP or offloading would do.
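The trade-off discussed above (CFG parallel plus Ulysses for speed when the model fits on one GPU, FSDP when it does not) can be sketched as a small helper. This is only an illustration: `choose_parallel_flags` is a hypothetical function, and both memory figures are inputs you would have to supply for your own hardware and checkpoint. Only the flags themselves come from the comment above.

```shell
# Sketch: pick multi-GPU flags based on whether the model fits on one GPU.
# choose_parallel_flags is a hypothetical helper, not part of SGLang.
#   $1: memory of a single GPU, in MiB
#   $2: assumed peak memory the model needs on one GPU, in MiB (you supply this)
choose_parallel_flags() {
  if [ "$1" -ge "$2" ]; then
    # Model fits on a single GPU: favor performance.
    echo "--enable-cfg-parallel --ulysses-degree 2"
  else
    # Model does not fit: shard the weights across GPUs.
    echo "--use-fsdp-inference"
  fi
}

# Example usage (commented out so that sourcing this file starts nothing):
#   gpu_mem=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n1)
#   sglang serve --model-path nvidia/Cosmos3-Super-Image2Video --num-gpus 4 \
#     $(choose_parallel_flags "$gpu_mem" "$MODEL_MEM_MIB")
```

`MODEL_MEM_MIB` in the usage comment is a placeholder; no measured memory footprint for any Cosmos3 checkpoint is implied here.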

nv-dmajchrowski · 2026-06-02T12:34:59Z

+```shell
+git clone https://github.com/sgl-project/sglang.git
+cd sglang
+pip install -e "python[diffusion]"


Can we make a tag or stable release of the sglang repo and pin it here?
This command will always download top-of-tree sglang, which is not what we want as part of the README.

Good point. I added an optional checkout step plus a version note. The default keeps tracking upstream SGLang to pick up ongoing Cosmos 3 fixes and performance improvements, while production or reproducible deployments should pin a release tag or known-good commit before install.
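A pinned install along these lines might look like the sketch below. The `run` and `install_sglang_pinned` helpers are hypothetical, and the ref is deliberately left as an argument, since no specific release tag or commit is named in this thread. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
# Sketch: install SGLang at a pinned ref instead of top of tree.
# run and install_sglang_pinned are illustrative helpers, not SGLang tooling.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1: release tag or known-good commit that you have validated yourself.
install_sglang_pinned() {
  ref="$1"
  if [ -z "$ref" ]; then
    echo "usage: install_sglang_pinned <release-tag-or-commit>" >&2
    return 2
  fi
  run git clone https://github.com/sgl-project/sglang.git &&
    run git -C sglang checkout "$ref" &&
    # Same as `cd sglang && pip install -e "python[diffusion]"`.
    run pip install -e "sglang/python[diffusion]"
}

# Example usage (commented out so that sourcing this file installs nothing):
#   DRY_RUN=1 install_sglang_pinned <release-tag-or-commit>
```

Recording the output of `git -C sglang rev-parse HEAD` next to a deployment is one way to note which commit was actually installed.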

atharvajoshi10 · 2026-06-02T17:10:40Z

+
+| Model | Status | Notes |
+| --- | --- | --- |
+| `nvidia/Cosmos3-Nano` | Supported | Text-to-image, text-to-video, image-to-video |


Probably good to specify we support other modalities such as sound and action.

Updated the wording to mention that Cosmos 3 includes video-with-sound and action/policy models, while keeping this SGLang section scoped to the currently supported T2I/T2V/I2V generator serving paths.

atharvajoshi10 · 2026-06-02T17:12:01Z

+cd sglang
+# Optional: pin a release tag or known-good commit for reproducible deployments.
+# git checkout <release-tag-or-commit>
+pip install -e "python[diffusion]"


Probably best to support uv or venv

Added a venv setup before the editable SGLang install.
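For readers who want either option, a minimal sketch that prefers uv when it is available and falls back to the standard-library venv module could look like this. `setup_env` is a hypothetical helper, the `.venv` directory name is an arbitrary choice, and the function is assumed to be run from inside the sglang checkout.

```shell
# Sketch: create an isolated environment, then do the editable install.
# setup_env is an illustrative helper; run it from the sglang checkout.
setup_env() {
  if command -v uv >/dev/null 2>&1; then
    # uv is on PATH: let it create the environment and install into it.
    uv venv .venv &&
      . .venv/bin/activate &&
      uv pip install -e "python[diffusion]"
  else
    # Fallback: standard-library venv plus pip.
    python3 -m venv .venv &&
      . .venv/bin/activate &&
      pip install -e "python[diffusion]"
  fi
}

# Example usage (commented out so that sourcing this file installs nothing):
#   cd sglang && setup_env
```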

atharvajoshi10 · 2026-06-02T17:14:23Z

+job_id=$(curl -sS -X POST http://localhost:30000/v1/videos \
+  --form-string "prompt=A small warehouse robot moves a blue box across a clean floor." \
+  --form-string "negative_prompt=blurry, distorted, low quality" \
+  --form-string "size=1280x720" \
+  --form-string "num_frames=81" \
+  --form-string "fps=24" \
+  --form-string "num_inference_steps=35" \
+  --form-string "guidance_scale=4.0" \
+  --form-string "flow_shift=10.0" \
+  --form-string "seed=42" \
+  --form-string 'extra_params={"guardrails":true,"use_resolution_template":false,"use_duration_template":false}' \
+  | python -c 'import json, sys; print(json.load(sys.stdin)["id"])')
+
+while true; do
+  status=$(curl -sS "http://localhost:30000/v1/videos/${job_id}" \
+    | python -c 'import json, sys; print(json.load(sys.stdin)["status"])')
+  [ "$status" = "completed" ] && break
+  [ "$status" = "failed" ] && exit 1
+  sleep 1
+done
+
+curl -sS -L "http://localhost:30000/v1/videos/${job_id}/content" \
+  -o cosmos3_t2v_output.mp4


Can we add comments here to improve readability?

Added comments for the submit, poll, and download steps in the video example.
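One small addition that may help the submit step read more clearly is pulling the id extraction into a named helper that fails loudly when the response carries no `id`. The sketch below assumes only what the snippet above already relies on, namely that the submit response is JSON with an `id` field; `extract_job_id` is a hypothetical name.

```shell
# Sketch: read the submit response on stdin and print the job id.
# Exits non-zero when the body is not JSON or has no "id" field, so a failed
# submit does not silently turn into polling an empty URL.
extract_job_id() {
  python3 -c '
import json, sys
try:
    job_id = json.load(sys.stdin)["id"]
except (ValueError, KeyError, TypeError):
    sys.exit(1)
print(job_id)
'
}

# Example usage (commented out; needs a running server):
#   job_id=$(curl -sS -X POST http://localhost:30000/v1/videos \
#     --form-string "prompt=A small warehouse robot moves a blue box across a clean floor." \
#     | extract_job_id) || exit 1
```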

atharvajoshi10 · 2026-06-02T17:17:56Z

+while true; do
+  status=$(curl -sS "http://localhost:30000/v1/videos/${job_id}" \
+    | python -c 'import json, sys; print(json.load(sys.stdin)["status"])')
+  [ "$status" = "completed" ] && break
+  [ "$status" = "failed" ] && exit 1
+  sleep 1
+done


Is there a cleaner way to write this?
Maybe

until [ "$status" = completed ]; do
  status=$(curl -sS "http://localhost:30000/v1/videos/${job_id}" | jq -r .status)
  [ "$status" = failed ] && exit 1
  sleep 1
done

Also, generation usually takes about 3 minutes on a single-GPU instance, so we can try sleeping for a longer duration.

Switched the polling snippet to a cleaner jq + until loop.

Increased the polling interval to 5 seconds to better match longer video generation times.
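A variation on the polling loop that also gives up after a maximum wait is sketched below. It is not the snippet that landed in the README. `fetch_status` and `wait_for_video` are hypothetical helpers, and the 600-second ceiling is an arbitrary assumption chosen to sit comfortably above the roughly 3 minutes mentioned above.

```shell
# Sketch: poll a video job with a 5 s interval and an upper bound on waiting.
# fetch_status is the only network call; it uses the same endpoint and jq
# filter as the loop discussed above.
fetch_status() {
  curl -sS "http://localhost:30000/v1/videos/$1" | jq -r .status
}

# wait_for_video <job_id> [interval_seconds] [max_wait_seconds]
# Returns 0 on "completed", 1 on "failed", 2 on timeout.
wait_for_video() {
  job_id="$1"
  interval="${2:-5}"
  max_wait="${3:-600}"   # assumed ceiling, tune for your hardware
  waited=0
  while :; do
    status=$(fetch_status "$job_id")
    case "$status" in
      completed) return 0 ;;
      failed) return 1 ;;
    esac
    if [ "$waited" -ge "$max_wait" ]; then
      echo "timed out after ${waited}s (last status: ${status})" >&2
      return 2
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Example usage (commented out; needs a running server and a real job id):
#   wait_for_video "$job_id" && curl -sS -L \
#     "http://localhost:30000/v1/videos/${job_id}/content" -o cosmos3_t2v_output.mp4
```

Keeping the HTTP call in its own function means the loop can be exercised offline by redefining `fetch_status` to echo a fixed status.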

atharvajoshi10

Thanks. Left some minor formatting comments; looks good otherwise.

mickqian added 3 commits June 1, 2026 17:54

Add SGLang Cosmos3 serving docs

df00036

Use async SGLang video API in Cosmos3 docs

52ff056

Update SGLang install snippet

44e38e7

mickqian marked this pull request as ready for review June 1, 2026 16:59

mickqian added 2 commits June 2, 2026 01:04

Remove vLLM reference from SGLang docs

76c4128

Simplify SGLang Cosmos3 serve examples

54bf140

nv-dmajchrowski reviewed Jun 2, 2026

Clarify SGLang install pinning guidance

499f629

atharvajoshi10 reviewed Jun 2, 2026

Update SGLang Cosmos3 README comments

03e80fe

lfengad requested a review from vinjn June 3, 2026 06:28

4 participants
