Skip to content

fix(daemon): match D3D12 adapter on multi-GPU hosts (#225)#226

Merged
BANANASJIM merged 4 commits into
masterfrom
fix/225-d3d12-multi-gpu
May 16, 2026
Merged

fix(daemon): match D3D12 adapter on multi-GPU hosts (#225)#226
BANANASJIM merged 4 commits into
masterfrom
fix/225-d3d12-multi-gpu

Conversation

@BANANASJIM
Copy link
Copy Markdown
Owner

@BANANASJIM BANANASJIM commented May 15, 2026

Fixes #225.

Problem

_match_capture_gpu() in daemon_server.py previously matched GPUs only via the Vulkan vkEnumeratePhysicalDevices structured-data chunk. D3D12 captures don't emit that chunk, so the function fell through to return gpus[0]. On the reporter's multi-GPU Windows machine (NVIDIA RTX 4500 Ada discrete + AMD Radeon iGPU + WARP) gpus[0] was the AMD iGPU with ~2 GB VRAM, which is insufficient for the D3D12 heaps the capture needs — replay failed with E_INVALIDARG on heap creation.

A second latent bug was discovered alongside it: the remote-replay call site never passed structured data to _match_capture_gpu, so even Vulkan matching was silently bypassed in remote mode.

Fix

src/rdc/daemon_server.py

  • _match_capture_gpu(cap, sd=None, rd=None) short-circuits on 0/1 GPUs, then walks sd.chunks:
    • Vulkan path preserved (existing vkEnumeratePhysicalDevicesphysPropsdeviceName match).
    • New D3D12 path: chunks whose name contains DriverInit, EnumAdapters, or CreateDXGIFactory are searched via a new _find_adapter_description helper that walks one or two levels deep looking for AdapterDesc/pAdapter/adapter containers and Description/DeviceName strings, then matches by substring against gpu.name.
  • When structured-data matching produces no hit, the fallback now drops Software/WARP and sorts the rest by vendor priority nVidia > AMD > Intel via renderdoc.GPUVendor (accessed defensively with getattr + integer fallback so missing enum members don't crash).
  • Both call sites (local replay at the OpenCapture path, remote replay at the metadata-load path) now pass cap.GetStructuredData() and rd.

Test plan

tests/unit/test_daemon_server_unit.py — new TestMatchCaptureGpu class with 9 cases:

  1. Single GPU returns it
  2. Vulkan match by deviceName (regression guard for existing path)
  3. Vulkan chunk present but deviceName matches nothing → vendor fallback (pins contract for swapped/uninstalled hardware)
  4. D3D12 match via DriverInit/AdapterDesc
  5. D3D12 fallback skips WARP, prefers NVIDIA over AMD iGPU
  6. D3D12 fallback prefers AMD over Intel
  7. No structured data → vendor fallback (the literal _match_capture_gpu() falls back to integrated GPU on multi-GPU D3D12 systems, causing OpenCapture heap failure #225 scenario)
  8. Empty GPU list → None
  9. Remote-replay call site passes structured data (asserts via spy)
$ pixi run pytest tests/unit/test_daemon_server_unit.py::TestMatchCaptureGpu -v
collected 9 items
========================== 9 passed in 0.03s ==========================

$ pixi run test
2855 passed, 6 skipped in 15.29s
Total coverage: 92.73%

$ pixi run lint
All checks passed!  # ruff + mypy strict

Spec

Adds a "Multi-GPU capture replay" scenario under ### Requirement: Replay lifecycle in openspec/specs/daemon/spec.md. OpenSpec change docs under openspec/changes/2026-05-15-fix-225-d3d12-multi-gpu/.

Cannot verify locally

This is a Linux dev box without a Windows multi-GPU + D3D12 setup. Logic is exercised end-to-end via mocked structured data, but the exact D3D12 chunk name emitted by renderdoc 1.42 (DriverInit vs SystemChunk::DriverInit vs version variants) and the AdapterDesc child layout are educated guesses based on the renderdoc source. The matcher uses substring matching and a two-tier helper as defensive measures, but a real capture from the reporter is the only way to confirm the field names actually line up.

@YunHsiao — would you mind grabbing this branch (or waiting for the merged build) and re-running rdc open on the original D3D12 capture? If it still picks the iGPU, the chunk-name guess is wrong and we can iterate on the _find_adapter_description walk. Thanks for the very detailed bug report — forceGPUVendor / forceGPUDeviceID evidence made the root cause obvious.

Commits

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced GPU selection during capture replay on multi-GPU systems to intelligently match source hardware using capture metadata.
    • Implemented vendor-priority GPU fallback (NVIDIA > AMD > Intel) for improved replay accuracy when exact matches unavailable.
  • Documentation

    • Added multi-GPU capture replay scenario specifications and comprehensive test coverage documentation.

Review Change Stack

_match_capture_gpu previously only matched Vulkan vkEnumeratePhysicalDevices
chunks; D3D12 captures fell through to gpus[0], picking the integrated GPU
on multi-GPU Windows hosts and failing heap creation with E_INVALIDARG.

- Walk D3D12 DriverInit / EnumAdapters / CreateDXGIFactory chunks for
  AdapterDesc.Description / DeviceName
- Vendor fallback prefers discrete (nVidia > AMD > Intel) and skips
  Software / WARP via renderdoc.GPUVendor (defensive getattr with int
  fallbacks)
- Remote replay call site now also passes structured data and the
  renderdoc module to the matcher
- Adds 8 unit tests covering single-GPU, Vulkan, D3D12, fallback, and
  remote-call-site paths
Pins the contract that when a Vulkan chunk is present but its deviceName
does not match any available GPU, _match_capture_gpu falls through to the
vendor-priority fallback instead of returning the first GPU.
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Warning

Rate limit exceeded

@BANANASJIM has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 58 minutes and 28 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2f25b2a9-7b64-4d44-94f0-eaeb832f494e

📥 Commits

Reviewing files that changed from the base of the PR and between 45543e9 and dca42d8.

📒 Files selected for processing (2)
  • src/rdc/daemon_server.py
  • tests/unit/test_daemon_server_unit.py
📝 Walkthrough

Walkthrough

This PR implements GPU selection improvements for multi-GPU D3D12 capture replay, addressing a failure where replay falls back to integrated/WARP GPUs lacking sufficient VRAM. The fix adds D3D12 structured-data matching, vendor-priority fallback, and ensures both local and remote replay paths pass required parameters consistently.

Changes

Multi-GPU D3D12 Capture Replay

Layer / File(s) Summary
Proposal, specification, and test planning
openspec/changes/2026-05-15-fix-225-d3d12-multi-gpu/proposal.md, tasks.md, test-plan.md, openspec/specs/daemon/spec.md
Proposal documents the GPU matching algorithm using D3D12 structured-data adapter descriptions with case-insensitive substring matching, vendor-priority fallback (NVIDIA > AMD > Intel, excluding Software/WARP), and single-GPU short-circuit behavior. Tasks checklist and test plan enumerate implementation and validation steps. Daemon spec formalizes the "Multi-GPU capture replay" scenario.
GPU matching algorithm implementation
src/rdc/daemon_server.py
_find_adapter_description() helper extracts D3D12 adapter description from structured-data subtrees (DriverInit/EnumAdapters/CreateDXGIFactory chunks). Enhanced _match_capture_gpu() accepts optional rd module to enable vendor-priority selection when structured matching is inconclusive, skipping WARP adapters and preferring discrete vendors.
Replay call-site integration
src/rdc/daemon_server.py
Local replay GPU selection (line 291) and remote replay GPU probe (line 448) both pass rd module into _match_capture_gpu(), enabling consistent vendor-priority fallback behavior across replay paths.
Unit test coverage for GPU selection
tests/unit/test_daemon_server_unit.py
TestMatchCaptureGpu suite with nine test cases covers single-GPU return, Vulkan deviceName matching, D3D12 AdapterDesc matching, vendor-priority fallback (NVIDIA preference, WARP exclusion, AMD over Intel ordering), sd=None handling, empty GPU list, and remote-replay parameter propagation.

Sequence Diagram(s)

sequenceDiagram
  participant ReplaySetup as Replay Setup
  participant MatchGPU as _match_capture_gpu()
  participant FindAdapter as _find_adapter_description()
  participant GPUList as Available GPUs
  
  ReplaySetup->>MatchGPU: call with gpus, sd, rd
  MatchGPU->>MatchGPU: check single GPU short-circuit
  MatchGPU->>MatchGPU: try Vulkan vkEnumeratePhysicalDevices match
  MatchGPU->>FindAdapter: extract D3D12 adapter description
  FindAdapter->>FindAdapter: traverse DriverInit/EnumAdapters chunks
  FindAdapter->>MatchGPU: return adapter description string
  MatchGPU->>GPUList: case-insensitive substring match vs gpu.name
  MatchGPU->>MatchGPU: vendor-priority fallback: NVIDIA > AMD > Intel
  MatchGPU->>MatchGPU: exclude WARP from fallback candidates
  MatchGPU->>ReplaySetup: return selected GPU
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • BANANASJIM/rdc-cli#208: Extends GPU selection logic in _match_capture_gpu() and updates daemon_server.py call sites; this PR adds D3D12 structured-data matching and vendor-priority fallback enhancements on top of similar infrastructure.

Poem

🐰 Hop through the GPU maze with grace,
No WARP or iGPU's sad embrace!
D3D12 chunks now sing their names,
NVIDIA wins the vendor games! 🎮✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely describes the main fix: matching D3D12 adapters on multi-GPU hosts, directly addressing issue #225.
Linked Issues check ✅ Passed The PR fully addresses all coding requirements from issue #225: D3D12 adapter matching via structured data, vendor-priority fallback (NVIDIA>AMD>Intel, skipping WARP), single-GPU short-circuit, signature updates to _match_capture_gpu, and remote-replay path fixes.
Out of Scope Changes check ✅ Passed All changes are directly within scope: GPU-matching logic improvements, supporting test cases, and specification updates for the multi-GPU replay scenario described in issue #225.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/225-d3d12-multi-gpu

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openspec/changes/2026-05-15-fix-225-d3d12-multi-gpu/proposal.md`:
- Around line 56-65: The fenced code block under the header "Scenario: Multi-GPU
capture replay" is missing a language identifier; update that fenced block to
include a language tag (e.g., markdown) so linters/renderers recognize it—locate
the triple-backtick block immediately before "#### Scenario: Multi-GPU capture
replay" in proposal.md and change it from ``` to ```markdown (or another
appropriate language) ensuring the rest of the block content remains unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fa6a3ff4-2af6-4920-81d7-b4453672ece5

📥 Commits

Reviewing files that changed from the base of the PR and between 9ac831b and 45543e9.

📒 Files selected for processing (6)
  • openspec/changes/2026-05-15-fix-225-d3d12-multi-gpu/proposal.md
  • openspec/changes/2026-05-15-fix-225-d3d12-multi-gpu/tasks.md
  • openspec/changes/2026-05-15-fix-225-d3d12-multi-gpu/test-plan.md
  • openspec/specs/daemon/spec.md
  • src/rdc/daemon_server.py
  • tests/unit/test_daemon_server_unit.py

Comment on lines +56 to +65
```
#### Scenario: Multi-GPU capture replay
- WHEN the capture was taken on a multi-GPU system
- AND multiple GPUs are available for replay
- THEN the daemon selects the GPU whose name matches structured-data adapter
metadata extracted from the capture
- AND if no structured-data match exists, Software/WARP adapters are excluded
and the highest-ranked discrete GPU is preferred (nVidia > AMD > Intel)
- AND a single available GPU is always returned directly without inspection
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language identifier to fenced code block.

The fenced code block lacks a language identifier. Add markdown or another appropriate identifier to satisfy linting rules and improve rendering.

📝 Proposed fix
-```
+```markdown
 #### Scenario: Multi-GPU capture replay
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```
#### Scenario: Multi-GPU capture replay
- WHEN the capture was taken on a multi-GPU system
- AND multiple GPUs are available for replay
- THEN the daemon selects the GPU whose name matches structured-data adapter
metadata extracted from the capture
- AND if no structured-data match exists, Software/WARP adapters are excluded
and the highest-ranked discrete GPU is preferred (nVidia > AMD > Intel)
- AND a single available GPU is always returned directly without inspection
```
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 56-56: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openspec/changes/2026-05-15-fix-225-d3d12-multi-gpu/proposal.md` around lines
56 - 65, The fenced code block under the header "Scenario: Multi-GPU capture
replay" is missing a language identifier; update that fenced block to include a
language tag (e.g., markdown) so linters/renderers recognize it—locate the
triple-backtick block immediately before "#### Scenario: Multi-GPU capture
replay" in proposal.md and change it from ``` to ```markdown (or another
appropriate language) ensuring the rest of the block content remains unchanged.

Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@BANANASJIM BANANASJIM merged commit 3735f2b into master May 16, 2026
17 checks passed
BANANASJIM added a commit that referenced this pull request May 17, 2026
…#225) (#230)

PR #226's structured-data adapter guard never fired on real hardware.
Defects fixed here:

1. Chunk-name guard: the real renderdoc 1.42 chunk is
   "Internal::Driver Initialisation Parameters" (with a space), which
   none of the legacy markers ("DriverInit"/"EnumAdapters"/
   "CreateDXGIFactory") match. Add "Driver Initialisation Parameters"
   to d3d_markers; keep the legacy markers (commented as version-fallback)
   for other versions/paths.

2. Tree depth: the real Description lives at depth 3
   (chunk -> InitParams -> AdapterDesc -> Description), but the old
   walker only inspected depth 1/2. Replace it with a depth-bounded
   recursive descent (cap 5) returning an AdapterMatch carrying
   (description, device_id); DeviceId read via AsInt(). VendorId is
   deliberately not parsed (PCI id vs GPUVendor enum are unrelated, so
   it has no consumer).

3. Disambiguation: name-substring matching is fragile for same-vendor
   multi-GPU. _match_capture_gpu now selects in three tiers: exact
   DeviceId == g.deviceID (primary), name-substring (secondary),
   vendor-priority (tertiary, unchanged).

4. WARP/software sentinel: DXGI_ADAPTER_DESC.DeviceId == 0 on
   software/WARP/virtualized captures. Tier-1 is gated on truthiness
   (not "is not None") and skips any candidate with deviceID == 0, so a
   zero id can never bind the WARP adapter.

5. Version-drift diagnostic: a distinct _log.warning fires when a
   multi-GPU capture has structured data but no recognized vk/d3d
   adapter chunk, surfacing future renderdoc chunk renames instead of
   silently falling back to vendor priority.

Adds 14 unit tests (5 red-first defect proofs + 9 regression guards,
incl. the WARP-sentinel guard).
@BANANASJIM BANANASJIM deleted the fix/225-d3d12-multi-gpu branch May 19, 2026 00:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

_match_capture_gpu() falls back to integrated GPU on multi-GPU D3D12 systems, causing OpenCapture heap failure

1 participant