
UCT/CUDA_IPC: Implement uct_ep_put_sgl_zcopy #11359

Draft
michal-shalev wants to merge 1 commit into openucx:master from michal-shalev:uct-cuda-ipc-sgl


michal-shalev commented Apr 19, 2026

What?

Implement uct_ep_put_sgl_zcopy for the cuda_ipc transport.

Why?

Allow UCP SGL put offload to run over cuda_ipc, replacing N separate per-buffer copies with a single batched submission and reducing per-element overhead on multi-buffer transfers.

How?

On CUDA 13+, issue one cuMemcpyBatchAsync for the whole SGL, with all mappings tracked in a per-event uct_cuda_ipc_sgl_mapping_t and unmapped on completion. On older CUDA, fall back to per-element uct_cuda_ipc_ep_put_zcopy, arming the user completion only on the last element.
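The control flow described above can be sketched in plain C. This is only an illustration under stated assumptions, not the code in this PR: every `sketch_*` name is hypothetical, `memcpy` stands in for the CUDA copy, and the copies complete synchronously, whereas the real transport relies on stream ordering and events. It shows the two paths: one batched submission with one completion, or one submission per element with the completion armed only on the last element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user completion (not the UCX type). */
typedef struct {
    void (*func)(void *arg);
    void  *arg;
} sketch_comp_t;

/* Hypothetical SGL element: one local buffer copied to one remote buffer. */
typedef struct {
    const void *src;
    void       *dst;
    size_t      length;
} sketch_sgl_elem_t;

/* Counts copy submissions so the two paths can be compared. */
static int sketch_submissions = 0;

/* Example completion callback: increments the int that arg points to. */
static void sketch_count_cb(void *arg)
{
    ++*(int *)arg;
}

/* Stand-in for a single-buffer put; fires comp only when it is non-NULL. */
static void sketch_put_one(const sketch_sgl_elem_t *elem, sketch_comp_t *comp)
{
    memcpy(elem->dst, elem->src, elem->length);
    ++sketch_submissions;
    if (comp != NULL) {
        comp->func(comp->arg);
    }
}

/* Stand-in for a batched copy: one submission covers the whole SGL. */
static void sketch_put_batch(const sketch_sgl_elem_t *sgl, size_t count,
                             sketch_comp_t *comp)
{
    size_t i;

    for (i = 0; i < count; ++i) {
        memcpy(sgl[i].dst, sgl[i].src, sgl[i].length);
    }
    ++sketch_submissions;
    if (comp != NULL) {
        comp->func(comp->arg);
    }
}

/* Dispatch: batched path when available, otherwise a per-element fallback
 * that passes the user completion only with the last element. */
static void sketch_put_sgl(const sketch_sgl_elem_t *sgl, size_t count,
                           sketch_comp_t *comp, int have_batch_api)
{
    size_t i;

    assert((sgl != NULL) || (count == 0));
    if (count == 0) {
        return;
    }

    if (have_batch_api) {
        sketch_put_batch(sgl, count, comp);
        return;
    }

    for (i = 0; i < count; ++i) {
        sketch_put_one(&sgl[i], (i == count - 1) ? comp : NULL);
    }
}
```

In this sketch a three-element SGL costs three submissions on the fallback path and one on the batched path, and the user callback fires exactly once in both cases.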

michal-shalev self-assigned this Apr 19, 2026
michal-shalev added the WIP-DNM (Work in progress / Do not review) label Apr 19, 2026
michal-shalev marked this pull request as draft April 19, 2026 16:18
michal-shalev force-pushed the uct-cuda-ipc-sgl branch 2 times, most recently from 81ed122 to 2101677 April 26, 2026 15:17
michal-shalev added the Ready for Review label and removed the WIP-DNM (Work in progress / Do not review) label Apr 26, 2026
michal-shalev marked this pull request as ready for review April 26, 2026 15:18
michal-shalev marked this pull request as draft April 26, 2026 16:54
michal-shalev added the WIP-DNM (Work in progress / Do not review) label Apr 26, 2026
openucx deleted a comment from svc-nixl May 4, 2026
Labels

WIP-DNM Work in progress / Do not review
