Intel QuickAssist: multi-device utilization + software-fallback / Cavium fixes by dgarske · Pull Request #10772 · wolfSSL/wolfssl

dgarske · 2026-06-24T22:28:01Z

Changes

- Interleave instances across devices. `cpaCyGetInstances()` returns instances grouped by device, so at thread counts below the instance count the per-thread round-robin piled every thread onto device 0. `IntelQaInterleaveInstances()` reorders the instances by device so consecutive threads land on different devices. Default on; opt out with `QAT_NO_DEV_INTERLEAVE`.
- Software-fallback fix. The NUMA allocator returned NULL when the QAT service wasn't started, breaking RSA / TLS cert-verify (-142/-140/-173) whenever the device couldn't be opened. It now falls back to regular memory so crypto runs in software. The fallback is gated by `IntelQaIsStarted()`, so a live device still gets a clean error on real NUMA exhaustion.
- Cavium/Nitrox `req_count` OOB write. `wolfAsync_EventQueuePoll()` did not reset `req_count` after the multi-request flush, so later writes indexed past `multi_req.req[CAVIUM_MAX_POLL]`. `HAVE_CAVIUM`-gated. (CWE-787, Project Vanessa.)
- RSA public free passed `dev` where `dev->heap` was required.
- Docs (`port/intel/README.md`): sudo-free operation, serial `make check`, multi-device benchmark guidance, and a QAT health-diagnostics section.
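The reordering described in the first item can be sketched as follows. This is a minimal illustration, not the PR's `IntelQaInterleaveInstances()`: the helper name `interleave_by_device`, the plain `int` device ids, and the `MAX_INST` cap are all assumptions made for the example, and the real function works on QAT instance handles rather than integers.

```c
#include <stddef.h>

#define MAX_INST 64  /* illustrative cap, not a wolfSSL or QAT constant */

/* Hypothetical helper: dev[i] is the device id of instance i (typically
 * grouped, e.g. 0,0,0,1,1,1,2,2,2). Writes to order[] a permutation of
 * 0..n-1 that cycles through the devices, so consecutive slots belong to
 * different devices whenever possible. Returns n, or 0 if n > MAX_INST. */
static size_t interleave_by_device(const int *dev, size_t n, size_t *order)
{
    unsigned char used[MAX_INST] = {0};
    size_t out = 0;

    if (n > MAX_INST)
        return 0;

    while (out < n) {
        int    seen[MAX_INST];   /* devices already served in this pass */
        size_t nseen = 0;
        size_t i, j;

        /* One pass takes the first unused instance of each device. */
        for (i = 0; i < n; i++) {
            if (used[i])
                continue;
            for (j = 0; j < nseen; j++) {
                if (seen[j] == dev[i])
                    break;
            }
            if (j < nseen)
                continue;        /* device already got a slot this pass */
            seen[nseen++] = dev[i];
            used[i] = 1;
            order[out++] = i;
        }
    }
    return out;
}
```

For nine instances grouped as devices `0,0,0,1,1,1,2,2,2`, the resulting order is `0,3,6,1,4,7,2,5,8`, so a round-robin over the first three slots already touches all three devices.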

Performance (3x Intel C62x, RSA-2048 sign, ops/sec)

The interleave spreads load across all 3 devices at thread counts below the instance count (18); neutral above that. AES unchanged vs master.

| Threads | Before | After |
| --- | --- | --- |
| 6 | 1.72M (device 0 only) | 2.01M (all 3 devices) |
| 16 | 9.39M | 12.66M (+35%) |
| 18 | 15.7M | 14.9M (noise) |

Copilot

Pull request overview

This PR improves hardware-accelerated crypto offload behavior for Intel QuickAssist (QAT) and Cavium/Nitrox, focusing on better multi-device utilization, more robust software fallback when QAT isn’t available, and a Nitrox polling safety fix.

Changes:

- Reorders QAT crypto instances to interleave across devices by default (opt-out via `QAT_NO_DEV_INTERLEAVE`) to improve utilization at lower thread counts.
- Fixes software-fallback behavior in the QAT NUMA allocator when the QAT service isn’t started, allowing crypto to proceed in software.
- Fixes a Cavium/Nitrox multi-request polling OOB condition by resetting `req_count` after buffer flush; also corrects an RSA public free heap parameter and expands Intel QAT documentation.
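The `req_count` issue follows a common batching pattern, modeled below. This is a generic sketch and not the code in `wolfcrypt/src/async.c`: `PollBatch`, `batch_add`, `batch_flush`, and the value of `MAX_POLL` are hypothetical stand-ins (the real `CAVIUM_MAX_POLL` value is not shown in this PR page).

```c
#include <stddef.h>

#define MAX_POLL 4  /* stand-in for CAVIUM_MAX_POLL; real value not shown */

/* Hypothetical fixed-size batch of pending request ids. */
typedef struct {
    int    req[MAX_POLL];
    size_t count;    /* number of valid entries in req[] */
    size_t flushed;  /* total entries handed off so far (for illustration) */
} PollBatch;

/* Hand the current batch off, then reset the write index. Without the
 * reset, the next batch_add() would write to req[MAX_POLL], one element
 * past the end of the array. */
static void batch_flush(PollBatch *b)
{
    b->flushed += b->count;
    b->count = 0;
}

/* Append one request id, flushing first if the buffer is full. */
static void batch_add(PollBatch *b, int id)
{
    if (b->count == MAX_POLL)
        batch_flush(b);
    b->req[b->count++] = id;
}
```

Adding ten ids to a zero-initialized `PollBatch` triggers two flushes and leaves two entries pending; every write stays inside `req[]` because the index returns to zero after each flush.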

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| wolfssl/wolfcrypt/port/intel/quickassist_mem.h | Adds an internal “is QAT started” query used by the QAT memory layer to decide when to fall back to regular memory. |
| wolfcrypt/src/port/intel/README.md | Updates Intel QAT usage docs (non-sudo operation, serialized testing guidance, multi-device benchmarking, diagnostics). |
| wolfcrypt/src/port/intel/quickassist.c | Adds `IntelQaIsStarted()` and instance interleaving logic; fixes RSA public free heap usage. |
| wolfcrypt/src/port/intel/quickassist_mem.c | Adds fallback to regular `malloc` when NUMA allocation fails and QAT service is not started. |
| wolfcrypt/src/async.c | Resets Cavium `req_count` after flushing multi-request poll buffer to avoid OOB writes. |
| Makefile.am | Serializes make execution when Intel QAT is enabled via `.NOTPARALLEL`. |

+/* Returns nonzero when the QAT crypto service is running. The memory layer
+ * uses this to decide whether a failed NUMA allocation should fall back to
+ * regular memory (service not started -> software mode) or remain NULL (real
+ * NUMA exhaustion while the device is in use). */
+int IntelQaIsStarted(void)
+{
+    return (g_cyServiceStarted == CPA_TRUE) ? 1 : 0;
+}


+        /* If the QAT memory subsystem is not available (async device not
+         * opened, e.g. "Running without async") fall back to regular memory
+         * so software crypto can proceed. A NULL while the subsystem IS up
+         * means real NUMA exhaustion and is left NULL so the QAT operation
+         * fails cleanly rather than receiving non-DMA memory. */
+        if (ptr == NULL && !IntelQaIsStarted()) {
+            isNuma = 0;
+            page_offset = QAE_NOT_NUMA_PAGE;
+            ptr = malloc(size + sizeof(qaeMemHeader));
+        }
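The hunk above records `isNuma = 0` in the allocation's header before falling back. One plausible reason, sketched below, is that the matching free routine needs to know which allocator owns the block. Everything here is an assumption for illustration: `MemHeader`, `mem_alloc`, `mem_free`, the stubbed `numa_alloc`/`numa_free`, and the `g_service_started` flag are hypothetical, and the real `qaeMemHeader` layout and free path are not shown in this PR page.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical header placed in front of every allocation. The union with
 * max_align_t keeps the user pointer suitably aligned. */
typedef union {
    struct { int is_numa; } info;
    max_align_t align;
} MemHeader;

static int g_service_started = 0;  /* stand-in for IntelQaIsStarted() */

/* Stubs modeling a NUMA pool that is unavailable. */
static void *numa_alloc(size_t n) { (void)n; return NULL; }
static void  numa_free(void *p)   { (void)p; }

static void *mem_alloc(size_t size)
{
    int is_numa = 1;
    MemHeader *h = numa_alloc(size + sizeof(MemHeader));

    /* Fall back only when the service is down; if it is up, a NULL means
     * real exhaustion and is passed through to the caller. */
    if (h == NULL && !g_service_started) {
        is_numa = 0;
        h = malloc(size + sizeof(MemHeader));
    }
    if (h == NULL)
        return NULL;
    h->info.is_numa = is_numa;
    return h + 1;                  /* user memory starts after the header */
}

static void mem_free(void *p)
{
    MemHeader *h;
    if (p == NULL)
        return;
    h = (MemHeader *)p - 1;
    if (h->info.is_numa)
        numa_free(h);
    else
        free(h);                   /* fallback blocks go back to the heap */
}
```

With the service flag at 0, `mem_alloc(32)` returns heap memory that `mem_free` releases with `free`; with the flag at 1 and the pool exhausted, `mem_alloc` returns NULL so the caller sees a clean failure.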


dgarske added 6 commits June 24, 2026 15:27

- `3ebe050` QAT: device-interleave instances + RSA public free fix + make-check/sudo-free docs; Cavium async req_count OOB fix
- `508526d` QAT: fall back to malloc for NUMA allocs when service not started (fixes software-fallback RSA/cert-verify -142/-140/-173)
- `30bf17a` QAT: document probing/diagnosing device health (adf_ctl, dmesg orphan rings, hugepages, AER, heartbeat)
- `ff28dd6` QAT: README accuracy fixes (software-fallback works; -173 not a WC_PENDING_E gap; add QAT_NO_DEV_INTERLEAVE)
- `2285b93` QAT: README latest C62x performance + correct testsuite/unit failure note
- `db784f1` QAT: serialize build/check via .NOTPARALLEL when --with-intelqa set; drop manual -j1 make check docs

dgarske self-assigned this Jun 24, 2026

Copilot AI review requested due to automatic review settings June 24, 2026 22:28

Copilot started reviewing on behalf of dgarske June 24, 2026 22:28

Copilot AI reviewed Jun 24, 2026

dgarske wants to merge 6 commits into wolfSSL:master from dgarske:qat_review
