Summary
The conda-cpp-tests ARM job on main fails immediately when launching STREAM_TEST: libcugraph_mg.so references a helper template instantiation that cannot be resolved at load time.
Failing job:
Failure signature:
STREAM_TEST: symbol lookup error:
/opt/conda/envs/test/bin/gtests/libcugraph/../../../lib/libcugraph_mg.so:
undefined symbol: _ZN7cugraph6detail13sequence_fillIiEEvRKN3rmm16cuda_stream_viewEPT_mS6_
Demangled symbol:
void cugraph::detail::sequence_fill<int>(
rmm::cuda_stream_view const&,
int*,
unsigned long,
int)
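To double-check the demangled form above, the mangled name can be passed through the Itanium C++ ABI demangler. The following is a minimal sketch, assuming a GCC or Clang toolchain; `demangle` is a hypothetical helper name and is not part of cugraph.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // __cxa_demangle allocates the result with malloc; the caller must free it.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Calling `demangle` on the symbol from the failure signature should yield the `cugraph::detail::sequence_fill<int>` signature listed above. The `c++filt` tool from binutils does the same job from a shell.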
Observed behavior
STREAM_TEST itself is a single-GPU test, but it loads libcugraph.so, which currently pulls in libcugraph_mg.so through the runtime dependency graph. libcugraph_mg.so then fails to load because it references cugraph::detail::sequence_fill<int> without that symbol being resolvable in this environment.
Likely root cause
libcugraph_mg.so depends on helper/template instantiations that are owned outside the MG library. This creates a fragile cross-DSO symbol dependency that can fail depending on link/load behavior and platform.
Possible fix direction
#5502 addresses this class of failure by introducing libcugraph_common.so to own shared helper/explicit-instantiation symbols and by separating single-GPU, multi-GPU, and MTMG source ownership.
The specific fix pattern is:
- move shared helper instantiations such as sequence_fill<int> into a common DSO
- make both libcugraph.so and libcugraph_mg.so depend on that common DSO
- avoid relying on libcugraph_mg.so resolving helpers from libcugraph.so
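The ownership pattern in the list above can be sketched in plain C++ as an explicit instantiation declaration in a shared header paired with a single explicit instantiation definition in the common library. This is an illustration only and is not taken from #5502: the `demo` namespace, the simplified signature without `rmm::cuda_stream_view`, the host-side loop, and the assumed "fill with consecutive values" behavior are all stand-ins, and everything is collapsed into one file so it compiles on its own.

```cpp
#include <cstddef>
#include <vector>

namespace demo {  // hypothetical namespace, not cugraph's

// --- what a shared header would contain ---
// Declaration only: translation units that include the header see no body,
// so they cannot instantiate the template implicitly.
template <typename T>
void sequence_fill(T* data, std::size_t size, T start);

// Explicit instantiation declaration: promises that sequence_fill<int> is
// defined in exactly one place (here, the common library).
extern template void sequence_fill<int>(int*, std::size_t, int);

// --- what the single source file in the common library would contain ---
// Assumed behavior for illustration: write start, start + 1, ... into data.
template <typename T>
void sequence_fill(T* data, std::size_t size, T start) {
    for (std::size_t i = 0; i < size; ++i) {
        data[i] = start + static_cast<T>(i);
    }
}

// Explicit instantiation definition: emits the symbol that the other
// libraries resolve at load time.
template void sequence_fill<int>(int*, std::size_t, int);

}  // namespace demo

// Hypothetical caller standing in for code in libcugraph.so or libcugraph_mg.so.
std::vector<int> make_sequence(std::size_t n, int start) {
    std::vector<int> v(n);
    demo::sequence_fill<int>(v.data(), v.size(), start);
    return v;
}
```

With this layout, each consuming library carries only an undefined reference to `sequence_fill<int>`, and that reference is satisfied by whichever library holds the explicit instantiation definition. Placing that definition in a DSO that both consumers link against directly removes the dependence on one sibling library happening to export it.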