[v2.6.0] Sync release-v2.6.0 to main #657
Draft
shubhadeepd wants to merge 4 commits into
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Plumb a per-NIM podAnnotations field from values.yaml through to
NIMService.spec.podAnnotations so users can attach pod-level
annotations to NIM workloads. Default is {} (omits the field), so
existing deployments render identically.
Primary motivator is Runai fractional GPU saving-mode, which requires
both gpu-fraction-style annotations on the pod AND fractional GPU
resources, e.g.:
    nimOperator:
      nim-llm:
        podAnnotations:
          gpu-fraction: "0.25"
          gpu-fraction-num-devices: "1"
        resources:
          limits: { runai.com/gpu: 1 }
          requests: { runai.com/gpu: 1 }
Templates touched: llm-nim, embedding-nim, reranking-nim, vlm-nim,
vlm-captioning-nim, vlm-embed-nim, vlm-reranker-nim. Each gains the
podAnnotations: {} default and a usage comment in values.yaml.
(cherry picked from commit ab4cddf)
Signed-off-by: Nikhil Kulkarni <nikkulkarni@nvidia.com>
Co-authored-by: Nikhil Kulkarni <nikkulkarni@nvidia.com>
(cherry picked from commit b1ea5e8)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 51d5caf)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Summary
This draft PR prepares release-v2.6.0 for merge into main while avoiding a regular merge-conflict-heavy integration path.
The branch was created from the current origin/main, then the origin/release-v2.6.0 tree was applied. After that, I reconciled changes that existed only on main and restored the ones that were still relevant and not superseded by release work.
The PR branch has now been refreshed with the latest origin/release-v2.6.0 at 51d5caf.
Current Branch State
- codex/release-v2.6.0-to-main-20260530
- main
- origin/main at 6fd878a
- origin/release-v2.6.0 at 51d5caf
- bcc33e4
Sync Strategy
- origin/main.
- origin/release-v2.6.0 release tree onto the branch.
- main-only commits and file contents against release-v2.6.0.
- main-only changes that were still valid.
- origin/release-v2.6.0.
Commits In This PR
- e32abe9 - chore: prepare release-v2.6.0 sync to main
- 8a6d690 - chore: keep release image paths staged
  - nvcr.io/nvstaging/blueprint/....
- 20c877d - Helm: expose podAnnotations on all NIMService templates (#658)
  - b1ea5e8.
- bcc33e4 - fix: move vlm reranker host port (#656)
  - 51d5caf.
Main-Only Changes Preserved
CI and automation
- .github/workflows/request-nvskills-ci.yml.
- .github/workflows/cve-create-pr.yml.
Examples
- examples/google-cloud-netapp-volumes-data-ingestor/.
- examples/README.md.
Documentation and release history
- docs/perf-benchmarks.md
- docs/assets/perf-benchmarks/*.png
- docs/index.md.
- docs/scripts/build_multiversion_docs.*
- docs/scripts/verify_doc_version_manifest.py
- docs/versions1.json while keeping 2.6.0 as the current preferred version.
- 2.5.1 release note section in docs/release-notes.md while keeping the 2.6.0 release notes at the top.
- Vidore-V3 naming in accuracy benchmark docs.
Deployment helpers
- main:
  - deploy/compose/nemotron3-super.env
  - deploy/compose/nemotron3-super-cloud.env
  - deploy/compose/nemotron3-super-prompt.yaml
  - deploy/helm/nvidia-blueprint-rag/nemotron3-super-values.yaml
  - deploy/helm/nvidia-blueprint-rag/nemotron3-super-rtx6000-values.yaml
Image Path Decision
Deployment image paths are intentionally kept on staging repositories for now:
- nvcr.io/nvstaging/blueprint/ingestor-server
- nvcr.io/nvstaging/blueprint/rag-server
- nvcr.io/nvstaging/blueprint/rag-frontend
This matches the current release-v2.6.0 branch state. The release branch is expected to receive a separate update that moves these paths to the public nvcr.io/nvidia/blueprint/... repositories.
Files checked for this decision:
- deploy/compose/docker-compose-ingestor-server.yaml
- deploy/compose/docker-compose-rag-server.yaml
- deploy/workbench/compose.yaml
- deploy/helm/nvidia-blueprint-rag/values.yaml
Main-Only Changes Not Restored
These were reviewed and left out because release-v2.6.0 appears to replace them with newer implementations:
- docs/vlm-embed.md
  - docs/multimodal-retriever.md as the replacement documentation path.
- src/nvidia_rag/utils/minio_operator.py
  - src/nvidia_rag/utils/object_store.py and the newer SeaweedFS/object-store configuration.
Validation Performed
- git diff --check origin/main..HEAD
- python3 docs/scripts/verify_doc_version_manifest.py
- 2.6.0.
- rg.
- origin/release-v2.6.0.
Reviewer Notes
Please pay particular attention to:
- main-only CI workflows should remain in main after the release sync.
- docs/vlm-embed.md
- src/nvidia_rag/utils/minio_operator.py
Operational Note
This PR is intentionally draft. Copy-pr-bot reported that auto-sync is disabled for draft PRs in this repository, so workflows may need to be run manually.
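For reviewers who want to reproduce the approach, the sync strategy described above (branch from main, apply the release tree, restore the main-only changes that are still valid, then validate) can be sketched in a throwaway repository. This is a minimal illustration under stated assumptions, not the exact commands used for this PR: the branch names main, release and sync-release-to-main, and the files app.txt, ci/main-only.yml, legacy.txt and replacement.txt, are all hypothetical stand-ins. It also uses local branches instead of origin/ remotes.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; every branch and file name here is illustrative.
work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Example Dev"

# Shared history.
echo "v1" > app.txt
git add app.txt
git commit -q -m "initial"
git branch release

# main-only work: one file worth keeping, one that release supersedes.
mkdir -p ci
echo "workflow" > ci/main-only.yml
echo "old helper" > legacy.txt
git add .
git commit -q -m "main-only changes"

# Release work: updates shared code and adds a replacement for legacy.txt.
git checkout -q release
echo "v2" > app.txt
echo "new helper" > replacement.txt
git add .
git commit -q -m "release work"

# Step 1: create the sync branch from main.
git checkout -q -b sync-release-to-main main

# Step 2: apply the release tree (drop all tracked files, then take release's).
git rm -rq .
git checkout release -- .
git commit -q -m "chore: apply release tree"
git diff --quiet release HEAD   # fails unless the tree now equals release

# Step 3: list main-only commits, then restore only what is still valid.
git log --oneline release..main
git checkout main -- ci/main-only.yml
git commit -q -m "chore: restore main-only CI workflow"

# Step 4: validate, mirroring the whitespace check listed under Validation Performed.
git diff --check main..HEAD
git ls-files
```

In this sketch legacy.txt is deliberately not restored, mirroring the files listed under Main-Only Changes Not Restored. Because the release tree is applied as a plain commit rather than a merge, the resulting branch does not record release as a parent; whether that matches the history of this PR is not stated in the description.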