Preserving hardware memory during cuvid decoding, exporting/importing via dlpack. by caffeinism · Pull Request #2155 · PyAV-Org/PyAV

caffeinism · 2026-02-04T16:35:56Z

Hello. I have limited knowledge of libav, DLPack, and Cython, but I see this as a necessary feature, so I drafted this PR with the help of an LLM.

Motivation

If an application decodes video, performs GPU operations, and then re-encodes it, PyAV currently incurs a significant amount of memory copying: GPU (cuvid) -> CPU (PyAV) -> GPU (Torch, etc.) -> CPU (PyAV) -> GPU (nvenc). If frames decoded by cuvid could instead be exported via DLPack while staying on the GPU, they would never need to be moved to CPU memory.
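To put a rough number on that chain, here is a back-of-the-envelope sketch. The resolution, frame rate, and NV12 layout are assumptions chosen for illustration, not measurements from this PR.

```python
# Assumed figures for illustration only: 1920x1080 NV12 frames at 24 fps.
WIDTH, HEIGHT, FPS = 1920, 1080, 24

# NV12 stores a full-resolution Y plane plus a half-resolution interleaved
# UV plane, which works out to 1.5 bytes per pixel.
y_bytes = WIDTH * HEIGHT
uv_bytes = (WIDTH // 2) * (HEIGHT // 2) * 2
frame_bytes = y_bytes + uv_bytes

# Each arrow in the chain is one host/device transfer of the whole frame.
chain = ["GPU (cuvid)", "CPU (PyAV)", "GPU (Torch)", "CPU (PyAV)", "GPU (nvenc)"]
transfers = len(chain) - 1

bytes_per_frame = frame_bytes * transfers
bytes_per_second = bytes_per_frame * FPS

print(f"one frame:        {frame_bytes:,} bytes")
print(f"copied per frame: {bytes_per_frame:,} bytes over {transfers} transfers")
print(f"copied per second at {FPS} fps: {bytes_per_second:,} bytes")
```

Under these assumptions, all four transfers would disappear if the frame stayed on the GPU from decode to encode.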

All existing tests pass, but with modifications this extensive, it is difficult for a beginner like me to catch every detail. Since most of the changes add features rather than modify existing ones, I hope this PR serves as a good starting point.

Usage example

```python
import av
from av.codec.hwaccel import HWAccel
import torch

hwaccel = HWAccel(
    device_type="cuda",
    device=0,
    allow_software_fallback=False,
    output_format="hw", # preserve hw memory
)

# decode using cuvid
with av.open(from_video_filename, "r", hwaccel=hwaccel) as c:
    frame = next(c.decode(video=0))
    y = torch.from_dlpack(frame.planes[0]) # device(type='cuda', index=0), torch.uint8, torch.Size([H, W])
    uv = torch.from_dlpack(frame.planes[1]) # device(type='cuda', index=0), torch.uint8, torch.Size([H/2, W/2])

f = av.VideoFrame.from_dlpack(((y*0.5).to(torch.uint8), uv)) # some operation

with av.open(to_video_filename, "w") as c:
    s = c.add_stream("h264_nvenc", rate=24) # encode using nvenc
    for it in s.encode(f):
        c.mux(it)
    for it in s.encode(None):
        c.mux(it)
```
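For readers new to DLPack, the zero-copy handoff the example relies on can be seen on the CPU with NumPy alone. This is only a sketch of the protocol itself: the array below is a stand-in for a decoded plane, and nothing here uses PyAV or the API proposed in this PR.

```python
import numpy as np

# Stand-in for a decoded luma plane; a real plane would live in GPU memory.
y = np.zeros((4, 6), dtype=np.uint8)

# The producer side of the protocol: the object reports where its memory
# lives as a (device_type, device_id) pair.
device = y.__dlpack_device__()

# The consumer side: from_dlpack() wraps the producer's buffer instead of
# copying it.
view = np.from_dlpack(y)

# Writing through the original array is visible through the imported view,
# which shows that both names refer to the same memory.
y[0, 0] = 255
shared = np.shares_memory(y, view)

print(device, shared, int(view[0, 0]))
```

The same two-sided handshake is what lets `torch.from_dlpack` consume a buffer without a copy when producer and consumer are on the same device.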

caffeinism · 2026-02-04T18:47:06Z

@WyattBlue If I add tests, is it acceptable for them to run only on a CUDA machine? I don't think they will work in the GitHub workflow.

WyattBlue · 2026-02-04T18:48:14Z

You need to test the interface. For example, hw_format does not have a .pyi interface, and writing a test would catch that fact.
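As a standalone illustration of the kind of mismatch meant here, the sketch below compares the public attributes of a runtime object with the names declared in stub text. Everything in it is hypothetical: `FakeHWAccel` and `STUB_SOURCE` are stand-ins, not PyAV's real class or stub, and a project would more likely catch this by type-checking its tests than with a hand-written check like this one.

```python
import ast


# Hypothetical stand-in for a compiled class; not PyAV's real HWAccel.
class FakeHWAccel:
    def __init__(self, device_type: str, output_format: str = "sw") -> None:
        self.device_type = device_type
        self.hw_format = output_format == "hw"


# Hypothetical stub text, playing the role of a .pyi file that lags behind.
STUB_SOURCE = '''
class FakeHWAccel:
    device_type: str
    def __init__(self, device_type: str, output_format: str = ...) -> None: ...
'''


def declared_names(stub_source: str, class_name: str) -> set[str]:
    """Collect attribute and method names declared for a class in stub text."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(stub_source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            for item in node.body:
                if isinstance(item, ast.AnnAssign) and isinstance(item.target, ast.Name):
                    names.add(item.target.id)
                elif isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    names.add(item.name)
    return names


def missing_from_stub(obj: object, stub_source: str) -> set[str]:
    """Return public runtime attributes of obj that the stub does not declare."""
    runtime = {name for name in vars(obj) if not name.startswith("_")}
    return runtime - declared_names(stub_source, type(obj).__name__)


missing = missing_from_stub(FakeHWAccel("cuda", output_format="hw"), STUB_SOURCE)
print(sorted(missing))
```

A check of this shape needs no GPU, which is why interface coverage can run in CI even when the CUDA code paths cannot.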

WyattBlue · 2026-02-04T18:50:52Z

av/hwcontext.pxd should be merged with include/avutil. *.pxd files should otherwise not be free radicals, i.e., they should have a corresponding real .py file.

caffeinism · 2026-02-05T02:34:53Z

> You need to test the interface. For example, hw_format does not have a .pyi interface, and writing a test would catch that fact.

Could you please explain it in a bit more detail?

> av/hwcontext.pxd should be merged with include/avutil. *.pxd files should otherwise not be free radicals, i.e., they should have a corresponding real .py file.

In this case, how should dlpack.pxd be handled? Should this also be moved to the include directory?

caffeinism · 2026-02-05T08:54:10Z

@WyattBlue Could you take a look at the last commit section? I modified the buffer creation logic for frames generated by VideoFrame(), VideoFrame.from_ndarray(), and VideoFrame.reformat to support dlpack.

caffeinism added 3 commits February 5, 2026 01:39

Impl __dlpack__, keep cuda memory

fda4962

Impl VideoFrame.from_dlpack

aaa90db

Impl minimal support device_id

56dd2dc

caffeinism force-pushed the dlpack branch from 5e0f429 to 56dd2dc Compare February 4, 2026 16:39

ruff / isort

9426057

caffeinism force-pushed the dlpack branch from 3ef7b26 to 9426057 Compare February 4, 2026 16:44

WyattBlue added the "needs tests" label (This PR needs a test) Feb 4, 2026

Merge av/hwcontext.pxd into include/libavutil/avutil.pxd

b22e7a5

caffeinism added 4 commits February 5, 2026 11:38

Move av/dlpack.pxd to include/dlpack.pxd

f713f95

Add tests/test_dlpack.py

d87422a

Fix interfaces

0af7dcf

Create VideoFrame using av_frame_get_buffer instead of av_image_alloc

a4a03ae

caffeinism force-pushed the dlpack branch from c346b18 to a4a03ae Compare February 5, 2026 08:51

caffeinism wants to merge 9 commits into PyAV-Org:main from caffeinism:dlpack
