Epic: shared helper module (displayxr-common) + unified capture API — kill the 6× Kooima/common drift #396

@dfattal

Why this epic

The same native helper code — above all the Kooima off-axis projection math (display3d_view.{c,h}, camera3d_view.{c,h}, leia_math.h) — is vendored by copy in at least six places across the org and is already drifting (the clip-plane fix in #393 had to be hand-applied to one copy with no way to sync the others). The #389 work (VK native compositor now reliably composites many window-space layers; new windowspace_handle_*_win test apps) turns the window-space-layer + HUD + input scaffolding into a genuinely reusable building block too. This epic makes the shared code a versioned dependency so editing-a-local-copy becomes structurally impossible.

#393 originally scoped this to the two C++ demos and excluded the engines. That was too narrow: the engine plugins compile the same math C files. This epic supersedes that framing; #393 is re-scoped to be layer 2 (the C++ scaffolding) of the design below.

Drift inventory — every current copy

| Copy location | Repo | Form | Notes |
|---|---|---|---|
| test_apps/common/ | displayxr-runtime | C/C++ | this repo's native test apps |
| common/ | displayxr-demo-modelviewer | C/C++ | Vulkan [0,1] projection fix landed here |
| common/ | displayxr-demo-gaussiansplat | C/C++ | + unmerged feat/zdp-clip-soft-fade-pick WIP (near/far, selftest, center_view) |
| include/ | kooima-projection | C | the intended math home — seeded then abandoned (last push 2026-05-04) |
| Source/DisplayXRCore/Private/Native/ | displayxr-unreal | C | compiled into the UE module (see ADR-003 UE-native off-axis projection) |
| native~/ (display3d_view.h, displayxr_kooima.{h,cpp}) | displayxr-unity | C++ → P/Invoke | wraps the same Kooima math (see ADR-006 window-relative Kooima) |

~20 of the ~28 common/ files are byte-identical between the two demos; nearly all divergence is accidental.

Architecture — one repo, two CMake targets (no new repo)

Repurpose the dormant kooima-projection repo → rename to displayxr-common, exposing two targets so the math core is shared by everything while engines never link the C++/Win32 scaffolding:

displayxr-common   (renamed from kooima-projection — net new repos: 0)
  ├─ displayxr::math     pure C: display3d_view, camera3d_view, leia_math — ZERO deps
  │     └─ linked by: displayxr-unreal (UE module), displayxr-unity (native~)   ← engines, math only
  └─ displayxr::common   C++ scaffolding (HUD/D2D, input/Win32, window mgr,
        xr_session_common, window-space-layer UI, stb, view_params,
        manifest cmake) — depends on displayxr::math
        (NOTE: atlas/screenshot capture is NOT shared here — it is removed in
         favour of a runtime API; see W6.)
        └─ linked by: runtime test apps, modelviewer, gaussiansplat            ← C++ apps

Consumed via CMake FetchContent pinned to a tag (same pattern already used for tinygltf/glm/OpenXR-loader). Local co-dev via FETCHCONTENT_SOURCE_DIR_DISPLAYXRCOMMON=../displayxr-common.

Why one repo, not two: FetchContent clones the whole repo regardless of target, so a separate math repo wouldn't make engines pull less source — the real link-level isolation comes from the displayxr::math target carrying no transitive Win32/D2D deps. A second repo would only earn its keep if the math needed an independent release cadence or had an external (non-DisplayXR) consumer; neither holds today. One repo = one CI, one tag, one branch-protection/CODEOWNERS/versions.json entry.

Divergence policy: mechanism in the lib, policy at the call site. Parameterize (e.g. display3d_compute_view(..., near_offset, far_offset, ...) outputs both the projection matrix and resolved near_z/far_z — modelviewer uses hardware clip, gauss feeds software cull); inject renderer-specific bits via callback; keep genuinely app-local files app-local (e.g. stb_image_impl_macos.cpp). Never #ifdef MODELVIEWER / #ifdef GAUSS inside the lib.
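
A minimal sketch of the "mechanism in the lib, policy at the call site" split, assuming a display centred at the origin in the z = 0 plane and an eye at ez > 0 in display space. The names (sketch_compute_view, sketch_view_out) and the meaning of near_offset / far_offset (distances from the display plane towards and away from the eye) are assumptions made for illustration; they are not the display3d_compute_view signature. The function returns both the off-axis projection (Vulkan-style [0,1] depth) and the resolved near_z / far_z, and leaves the choice between them to the caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical output struct: not the displayxr-common type. */
typedef struct {
    float proj[16]; /* column-major, right-handed, depth mapped to [0,1] */
    float near_z;   /* resolved near distance from the eye, metres */
    float far_z;    /* resolved far distance from the eye, metres */
} sketch_view_out;

/*
 * Off-axis (Kooima-style) projection for an eye at (ex, ey, ez).
 * ASSUMPTION: near_offset / far_offset are measured from the display plane,
 * so near = ez - near_offset and far = ez + far_offset.
 * The view translation by -eye is a separate step and is not shown.
 */
static void sketch_compute_view(float ex, float ey, float ez,
                                float width, float height,
                                float near_offset, float far_offset,
                                sketch_view_out *out)
{
    float n = ez - near_offset;
    float f = ez + far_offset;
    assert(out != NULL && ez > 0.0f && n > 0.0f && f > n);

    /* Project the display edges onto the near plane. */
    float k = n / ez;
    float l = (-0.5f * width  - ex) * k;
    float r = ( 0.5f * width  - ex) * k;
    float b = (-0.5f * height - ey) * k;
    float t = ( 0.5f * height - ey) * k;

    memset(out->proj, 0, sizeof out->proj);
    out->proj[0]  = 2.0f * n / (r - l);
    out->proj[5]  = 2.0f * n / (t - b);
    out->proj[8]  = (r + l) / (r - l);   /* horizontal skew */
    out->proj[9]  = (t + b) / (t - b);   /* vertical skew */
    out->proj[10] = f / (n - f);         /* z = -n maps to depth 0 */
    out->proj[11] = -1.0f;
    out->proj[14] = n * f / (n - f);     /* z = -f maps to depth 1 */

    out->near_z = n;
    out->far_z  = f;
}
```

Under this shape a hardware-clip consumer such as modelviewer would upload proj and ignore the distances, while a software-cull consumer such as gauss would read near_z / far_z. Neither needs an #ifdef inside the library.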

Workstreams & ordering

W1 — Reconcile the math core (blocking prerequisite; highest drift risk).
Land gauss's feat/zdp-clip-soft-fade-pick, then unify display3d_view.{c,h} + camera3d_view.{c,h} into one superset: Vulkan [0,1] projection + near_offset/far_offset API + near_z/far_z outputs + Unity's window-relative Kooima (ADR-006) + any Unreal delta. Validate against each consumer before landing.

W2 — Stand up displayxr-common. Rename kooima-projection; add the displayxr::math target seeded from the reconciled core; add Win+Mac CI on a tiny consumer harness; tag v0.1.0 (math only).

W3 — Adopt the math core everywhere (engines included). Replace each vendored math copy with the pin:

  • displayxr-runtime test apps (*_handle_*_win, windowspace_handle_*_win)
  • displayxr-demo-modelviewer
  • displayxr-demo-gaussiansplat
  • displayxr-unreal (UE Build.cs references the fetched C core instead of Private/Native/)
  • displayxr-unity (native~ build compiles the fetched C core)

W4 — Extract the C++ scaffolding layer (this is re-scoped #393). ✅ DONE 2026-06-06. Reconciled hud/input/xr_session deltas into the lib superset; added the single platform-gated displayxr::common target (displayxr-common#5); migrated the three C++ consumers (runtime #462, gauss #34, modelviewer #24 — each common/ deleted down to a thin shim + app-local leia_math.h, ~45k duplicated lines removed). Tagged v0.3.0 (the planned v0.2.0 was taken by W3's Layer-1 work). All user-verified on the Leia display. Details: W4-complete comment.

W5 — Release discipline. ✅ DONE 2026-06-05/06. displayxr-common tags disciplined since v0.1.0 (now v0.3.0); consumers bump the pin on their own cadence. The drift-guard went from optional to shipped: no-vendored-math.yml in all 5 consumers (math filenames), extended 2026-06-06 in the 3 C++ consumers to also catch re-vendored scaffolding filenames.

W6 — Replace app-side capture with a runtime API (supersedes sharing atlas_capture via the lib).
The I-key / Ctrl+Shift+C screenshot is reimplemented 6× today — 5 per-API readbacks in test_apps/common/atlas_capture_{d3d11,d3d12,gl,vk,metal} (forked byte-for-byte into both demos' common/), plus Unity Runtime/DisplayXRScreenshot.cs and Unreal DisplayXRAtlasCapture.cpp. Rather than fold atlas_capture into displayxr::common, the runtime gains one official xrCaptureAtlasEXT (new XR_EXT_atlas_capture, PROJECTION_ONLY / POST_COMPOSE flag) and every app deletes its readback. Only the platform flash-overlay + filename-numbering UX stays app-local. Full design + per-repo deletion list: docs/roadmap/unified-atlas-capture.md.

Scope note: W6 expands this epic beyond the original Kooima/common math dedup — it adds a runtime OpenXR extension and engine capture reimplementation (the engines' capture code was never in the math drift inventory). Folded here per the shared common/ surface and migration coordination; the runtime extension itself is independently releasable and need not gate on W1–W5.

W7 — RAW inputs / rig generators / render-ready output (spike → design → implement).
A design-first workstream that reshapes how view geometry is served, and changes what displayxr::math is for. Design doc WRITTEN 2026-06-06: docs/roadmap/raw-vs-render-ready-views.md (e99f0c39b + terminology/per-view pass 7710d5bd6); model summary:

  • Output is a fixed point: render-ready = standard XrView { pose, skewed XrFovf }, already converged on the plane — rig-agnostic. Legacy/non-aware apps and the runtime's weaver consume exactly this.
  • Rig is an input-side generator, not an output field. Display-centric and camera-centric are two paradigms that route raw inputs → the same output shape. They live in displayxr::math (re-roling W1: display3d_view = display-centric rig, camera3d_view = camera-centric rig — two intentional generators, not two drifting copies). Future rigs = new generators, no API/output change.
  • Raw = the generator inputs: eye positions, display-plane pose, effective canvas rect (handle = window, texture = subrect — no app-class branching), timestamp_ns, is_tracking. The escape hatch for any future modifier (IPD/parallax/convergence/ortho/clip) the runtime can't anticipate — promise complete inputs + a stable output, predict nothing.
  • Runtime links displayxr::math to compute render-ready, so "runtime's render-ready" ≡ "aware app's own computation from raw under the default rig" by construction — kills the "runtime view ≠ my view" bug class.

Session findings (code as of 2026-06-02 — verified, with file:line):

  • The runtime IS the suspected 7th copy. src/xrt/auxiliary/math/m_camera3d_view.{c,h} + m_display3d_view.{c,h} + m_multiview.{c,h} are FOV-only xrt-typed ports of app-side test_apps/common/camera3d_view.c / display3d_view.c (header: "Runtime-side port (xrt types, FOV-only — no matrices)"); consumed by oxr_session.c (#include "math/m_camera3d_view.h" :42). Hand-synced → real drift. ⇒ displayxr::math must expose two layers: FOV-only (runtime consumer) and FOV+matrix (app consumers).
  • Render-ready already ships — as a full TWO-rig system with live modifiers, default camera-centric @ 0.5 D (CORRECTED — the earlier "always display-centric" note was wrong). oxr_session.c:1384-1426 branches on view_state.camera_mode: camera3d_compute_views() (camera-centric) vs display3d_compute_views() (display-centric), mirrored server-side at ipc_server_handler.c:~504. Default is camera-centric, cam_convergence = 0.5 D / 2 m (qwerty_device.c:642,647) — i.e. the intended legacy default already holds. External-window apps (handle/texture, real HWND) are forced display-centric via the !sess->has_external_window gate (window = portal). WebXR-over-IPC gets render-ready (bridge runs on d3d11_service, so the normal path — not the narrow xc==NULL headless raw fast-path at ipc_server_handler.c:415-435).
  • The rig + tunables are driven ONLY by the qwerty debug device, not an app API — that is the actual W7 gap. qwerty_view_state (camera_mode + cam/disp spread/parallax/convergence, qwerty_device.h:51-60) is read by the compositors (comp_gl:872, comp_metal:1238, comp_multi_system:1925) and fed into the view computation. Rig toggle = P key (qwerty_win32.c:546-549 / qwerty_macos.m:431-434 → qwerty_toggle_camera_mode, qwerty_device.c:1066); convergence/spread on other keys. Apps cannot select rig or set convergence/IPD through OpenXR today.
  • Measured-vs-predicted is already plumbed at the DP. Single accessor xrt_display_processor::get_predicted_eye_positions() (xrt_display_processor.h:145) → xrt_eye_positions already carries timestamp_ns + is_tracking + valid (xrt_display_metrics.h:51-60); MANAGED/MANUAL (docs/specs/vendor/eye-tracking-modes.md) is the predict-vs-passthrough knob. ⇒ no new DP accessor; just surface timestamp_ns/is_tracking into the raw channel (stops at the compositor today).

Reframed W7 goal: render-ready + two rigs + modifiers + camera-centric-0.5D default already exist and are correct — but controllable only via the qwerty debug keyboard. W7 = promote that control to an app-facing OpenXR extension (app selects rig + sets convergence/spread/parallax via XrViewLocateRigEXT), add a raw channel for aware apps that bring their own generator, and keep qwerty as the dev/default driver. Not "add render-ready."

Open decisions for the design doc:

  • First impl-session check — ANSWERED: cube_handle_* ignore XrView.fov in 3D mode and recompute from .pose + their own displayxr::math calls (cube_handle_d3d11_win/main.cpp:773-775); the runtime fov is consumed only in mono/legacy mode. So the rig request makes the existing render-ready output finally consumable, and the raw channel formalizes what 3D apps already treat .pose as.
  • Default is settled (camera-centric @ 0.5 D, already in code). Decided: chaining a rig descriptor lifts the external-window forcing — an explicit XrCameraRigEXT is the knowledge the guard was substituting for; sessions that chain nothing keep today's behavior exactly.
  • API shape decided in the doc — one extension XR_EXT_view_rig (types 1000999140-142): request = chain exactly one of XrDisplayRigEXT{pose, virtualDisplayHeight, ipdFactor, parallaxFactor, perspectiveFactor} / XrCameraRigEXT{pose, ipdFactor, parallaxFactor, convergenceDiopters, verticalFov} on XrViewLocateInfo::next; result = XrViewDisplayRawEXT (raw display-space eyes + display pose + canvas rect + sampleTimeNs + isTracking) on XrViewState::next. Descriptors carry no clip and no placement params. Rig state is per-session; qwerty = debug fallback for non-extension sessions. Open sub-questions tracked in the doc (raw chain point, clamp-vs-reject, workspace hook, tracked-vs-synthesized raw eyes).
  • Make the runtime a displayxr::math consumer → decided + designed (doc section "Equivalence by construction"): type-neutral core layer in displayxr-common (own dxr_* types, zero OpenXR dep) + byte-compatible OpenXR-typed wrapper (the 5 existing pins untouched) + new xrt-typed wrapper replacing m_display3d_view/m_camera3d_view/m_multiview. Lands with the W7 implementation so the rig path is born on the shared core.
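
The "chain exactly one of" rule for the rig request suggests a next-chain walk along these lines. This is a sketch with stand-in types: sketch_base_in plays the role of OpenXR's XrBaseInStructure, the mapping of 1000999140 / 1000999141 to the two rig structs is a placeholder (the issue only gives the range 1000999140-142), and treating a second descriptor as invalid is an assumption, not a decision recorded here.

```c
#include <stddef.h>

/* Stand-in for XrBaseInStructure: a type tag followed by the next pointer. */
typedef struct sketch_base_in {
    int type;
    const struct sketch_base_in *next;
} sketch_base_in;

/* ASSUMPTION: placeholder assignment within the 1000999140-142 range. */
enum {
    SKETCH_TYPE_DISPLAY_RIG = 1000999140,
    SKETCH_TYPE_CAMERA_RIG  = 1000999141
};

typedef enum {
    SKETCH_RIG_NONE,    /* nothing chained: keep today's behaviour */
    SKETCH_RIG_DISPLAY, /* display-centric descriptor chained */
    SKETCH_RIG_CAMERA,  /* camera-centric descriptor chained */
    SKETCH_RIG_INVALID  /* more than one descriptor chained */
} sketch_rig_request;

/* Walk the chain, skip unknown structs, and accept at most one rig. */
static sketch_rig_request sketch_find_rig(const void *next)
{
    sketch_rig_request found = SKETCH_RIG_NONE;
    for (const sketch_base_in *s = next; s != NULL; s = s->next) {
        sketch_rig_request here;
        if (s->type == SKETCH_TYPE_DISPLAY_RIG) {
            here = SKETCH_RIG_DISPLAY;
        } else if (s->type == SKETCH_TYPE_CAMERA_RIG) {
            here = SKETCH_RIG_CAMERA;
        } else {
            continue; /* unrelated extension struct */
        }
        if (found != SKETCH_RIG_NONE) {
            return SKETCH_RIG_INVALID;
        }
        found = here;
    }
    return found;
}
```

A session that chains nothing gets SKETCH_RIG_NONE, which corresponds to the "keep today's behavior exactly" case, with qwerty remaining the fallback driver.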

Status: design doc LANDED 2026-06-06 — implementation is a follow-up session (extension wiring + per-session state + next-chain walk in oxr_xrLocateViews + the math fold-in).

Per-repo touch list (incl. native-build changes)

| Repo | Change | Native build impact |
|---|---|---|
| kooima-projection → displayxr-common | rename, two targets, CI, tags | |
| displayxr-runtime | test apps consume math + common pin | CMake only |
| displayxr-demo-modelviewer | delete common/, pin | CMake only |
| displayxr-demo-gaussiansplat | land WIP, delete common/, pin | CMake only |
| displayxr-unreal | drop Private/Native/ math, pin + W6: drop DisplayXRAtlasCapture readback | Build.cs |
| displayxr-unity | drop native~ math copy, pin + W6: gut DisplayXRScreenshot.cs readback | native~ CMake / build-win.bat |
| displayxr-runtime | W6: XR_EXT_atlas_capture + delete atlas_capture_* | runtime + CMake |
| displayxr-extensions | W6: header auto-sync | |
| displayxr-runtime | W7: fold m_multiview/m_display3d_view into displayxr::math (runtime becomes a math consumer); add raw-channel + rig-select to xrLocateViews | runtime + CMake |

Decisions taken

  • One repo, two CMake targets (not two repos) — avoids proliferation; link isolation via targets.
  • Repurpose existing kooima-projection (net zero new repos); it already holds the math files.
  • FetchContent-by-tag; independent per-consumer cadence preserved.
  • Capture is removed, not shared: the atlas_capture family is deleted in favour of a runtime xrCaptureAtlasEXT (W6), so it is intentionally excluded from displayxr::common.
  • The runtime is the 7th math consumer (W7): m_multiview/m_display3d_view are FOV-only ports of the app-side generators; fold into displayxr::math so render-ready ≡ app-from-raw by construction. display3d_view/camera3d_view are the two canonical rig generators, not drifting copies.

Open questions — all resolved

  • Exact lib surface → settled in W4: everything goes in including the stb impl TUs and the macOS platform glue (the lib owns the only STB_IMAGE[_WRITE]_IMPLEMENTATION per platform); only leia_math.h stays app-local. Extension headers via consumer-provided DISPLAYXR_EXTENSIONS_INCLUDE_DIR.
  • Versioning → settled in practice: tag per meaningful change (v0.1.0 → v0.3.0), consumers pin and bump on their own cadence; no minimum-version gating needed so far.
  • Engine math consumption mechanics → settled in W3: Unreal = git submodule at Source/ThirdParty/displayxr-common (UBT has no per-file source exclusion; #include shims compile the C core), Unity native~ = standard FetchContent.

Related: #393 (re-scoped to W4), #389 (window-space-layer building block), docs/roadmap/unified-atlas-capture.md (W6 capture-API design + per-repo deletion list), docs/roadmap/raw-vs-render-ready-views.md (W7 view-model design — landed 2026-06-06), displayxr-unity ADR-006, displayxr-unreal ADR-003.
