
xpumd 1.3.6 SIGABRT on Arc Pro B70 (0xe223) — gmm_helper/resource_info.cpp:15 with newer compute-runtime 26.14 / libigdgmm12 22.9 #128

@cwhanlon


Summary

xpumd from xpumanager 1.3.6 crashes with SIGABRT during the "initialize device manager" step on an Intel Arc Pro B70 (PCI ID 0xe223) running the latest user-space compute stack (compute-runtime 26.14.37833.4, libigdgmm12 22.9). Standalone xpu-smi works on the same host; only the daemon path fails.

Environment

  • GPU: Intel Arc Pro B70 (BMG-G31, PCI 8086:e223) — single-GPU host
  • OS: Ubuntu 24.04.4 LTS (noble)
  • Kernel: 6.17.0-23-generic (HWE) — using the xe driver, not i915
  • Re-BAR: enabled (full 32 GiB BAR mapped)
  • Compute runtime: intel-opencl-icd 26.14.37833.4-0, libze-intel-gpu1 26.14.37833.4-0 (from intel/compute-runtime v26.14.37833.4 .debs)
  • IGC: intel-igc-core-2 2.32.7, intel-igc-opencl-2 2.32.7 (from intel/intel-graphics-compiler v2.32.7)
  • libigdgmm12: 22.9.0 (from compute-runtime release)
  • Level Zero loader: 1.21.9 (libze1 from repositories.intel.com/gpu/ubuntu noble unified)
  • xpumanager: v1.3.6 (xpumanager_1.3.6_20260206.143628.1004f6cb.u24.04_amd64.deb)

clinfo correctly reports the device under this stack:

Platform #0: Intel(R) OpenCL Graphics
 `-- Device #0: Intel(R) Graphics [0xe223]
Driver Version: 26.14.37833.4
Global memory size: 32530182144 (30.3GiB)

Reproduction

  1. Fresh Ubuntu 24.04.4 with kernel 6.17 HWE on a host containing only an Arc Pro B70.
  2. Install Intel compute stack from the GitHub releases above.
  3. Install xpumanager_1.3.6_*.u24.04_amd64.deb.
  4. xpum.service starts; xpumd aborts ~3s later, before reaching device discovery.

Crash trace

xpumd: XPUM: Init xpum library
xpumd: XPU Manager:        1.3.6.20260206
xpumd: Build:                1004f6cb
xpumd: Level Zero:        1.21.9
xpumd: xpumd core starts to initialize
xpumd: initialize configuration
xpumd: xpum mode: xpum
xpumd: initialize datalogic
xpumd: initialize device manager
xpumd: Abort was called at 15 line in file:
xpumd: ../../neo/shared/source/gmm_helper/resource_info.cpp
systemd[1]: xpum.service: Main process exited, code=dumped, status=6/ABRT
systemd[1]: xpum.service: Failed with result core-dump.
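
When comparing this trace against other NEO builds, it may help to extract the abort site mechanically. A minimal sketch in Python, assuming the banner keeps the two-line "Abort was called at N line in file:" shape seen in this one log; the function name find_abort_site is hypothetical and not part of xpumanager or NEO.

```python
import re

# Matches the abort banner as it appears in the trace above; the exact
# wording is assumed from this single log and may differ in other builds.
ABORT_RE = re.compile(r"Abort was called at (\d+) line in file:")


def find_abort_site(log_text):
    """Return (source_path, line_number) for the first abort banner, or None."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = ABORT_RE.search(line)
        if match and i + 1 < len(lines):
            # The path is printed on the next line, after the "xpumd: " prefix.
            path = lines[i + 1].split(": ", 1)[-1].strip()
            return path, int(match.group(1))
    return None


SAMPLE = """\
xpumd: initialize device manager
xpumd: Abort was called at 15 line in file:
xpumd: ../../neo/shared/source/gmm_helper/resource_info.cpp
systemd[1]: xpum.service: Main process exited, code=dumped, status=6/ABRT
"""

if __name__ == "__main__":
    print(find_abort_site(SAMPLE))
```

The sample text is copied from the trace in this report; feeding the function real journal output would require capturing it to a string first.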

Expected behavior

xpumd should initialize on Arc Pro B70 and expose the per-engine telemetry / temperatures / bandwidth that are unavailable through standalone xpu-smi alone.

Notes

  • The abort site (gmm_helper/resource_info.cpp:15) suggests that xpumanager 1.3.6 bundles a NEO/GMM build that predates B70 (0xe223) device support and hits an unhandled-device path during the "initialize device manager" step.
  • xpu-smi (the standalone CLI without the daemon) works correctly on this host and reports the device, power draw, frequency, and memory used. Engine utilization, temps, and bandwidth are N/A from xpu-smi alone — which is why the daemon would be useful here.
  • Compute-runtime 26.14.37833.4 (April 2026) is the first NEO release I am aware of that has 0xe223 in shared/source/dll/devices/devices_base.inl as a BmgHwConfig entry. xpumanager 1.3.6 (Feb 2026) likely bundles an older NEO snapshot that does not have it.
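
The last note could be checked against any given NEO source snapshot by searching its device list for the ID. A minimal sketch in Python, assuming only that the ID appears as a hex literal somewhere in the file text; the macro layout of devices_base.inl is not assumed, and both has_device_id and the sample lines below are hypothetical illustrations rather than content copied from NEO.

```python
import re


def has_device_id(inl_text, device_id):
    """Return True if device_id appears as a hex literal (any case) in inl_text."""
    # Word boundaries keep 0xe223 from matching inside a longer literal.
    pattern = re.compile(r"\b0x0*{:x}\b".format(device_id), re.IGNORECASE)
    return pattern.search(inl_text) is not None


if __name__ == "__main__":
    # Hypothetical lines for illustration only; the real file format may differ.
    newer_snapshot = "DEVICE(0xE223, BmgHwConfig)"
    older_snapshot = "DEVICE(0xE20B, BmgHwConfig)"
    print(has_device_id(newer_snapshot, 0xE223))
    print(has_device_id(older_snapshot, 0xE223))
```

To run it against a real checkout, read the file contents into a string and pass them as inl_text.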

Workaround

Remove xpumanager and use the standalone xpu-smi package. This provides only partial telemetry, but it does not crash.
