I'm running SpMV kernels in CTF through the contraction interface (`a["i"] = B["ij"] * c["j"]`), and the performance is nearly two orders of magnitude slower than PETSc and Trilinos on large matrices from the SuiteSparse collection (such as the arabic-2005 graph). I know I've asked similar questions before, but is a discrepancy of this size expected, or does CTF provide a specialized kernel for SpMV? I believe I've configured CTF correctly for my system, and I'm happy to share the configuration logs to double-check.
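
To be explicit about the operation being compared, the contraction should be equivalent to a plain row-wise CSR loop like the one below. This is only a minimal serial sketch: `CsrMatrix` and `spmv` are hypothetical names for illustration, not CTF, PETSc, or Trilinos types, and it assumes a standard CSR layout.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container for illustration; not a CTF type.
struct CsrMatrix {
    std::size_t rows;                  // number of rows
    std::vector<std::size_t> row_ptr;  // size rows + 1, offsets into col_idx/values
    std::vector<std::size_t> col_idx;  // column index of each stored nonzero
    std::vector<double> values;        // value of each stored nonzero
};

// Computes a[i] = sum_j B[i][j] * c[j], the same result the
// contraction a["i"] = B["ij"] * c["j"] is meant to produce.
std::vector<double> spmv(const CsrMatrix& B, const std::vector<double>& c) {
    std::vector<double> a(B.rows, 0.0);
    for (std::size_t i = 0; i < B.rows; ++i) {
        double sum = 0.0;
        // Walk only the stored nonzeros of row i.
        for (std::size_t k = B.row_ptr[i]; k < B.row_ptr[i + 1]; ++k) {
            sum += B.values[k] * c[B.col_idx[k]];
        }
        a[i] = sum;
    }
    return a;
}
```

The work here is proportional to the number of stored nonzeros, which is the baseline I have in mind when comparing against the contraction interface.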
cc @solomonik