[EXPERIMENTAL] Add RV32-IM backend#1119

Draft
hanno-becker wants to merge 1 commit into main from rv32im_backend

hanno-becker (Contributor) commented May 15, 2026

This commit adds an experimental arithmetic backend for RV32-IM.

The backend includes:

  • A 2+2+2+2 forward NTT
  • A 2+2+2+2 inverse NTT
  • A poly-poly base multiplication (but not vector-vector)

Modular arithmetic in the NTT/inverse-NTT uses Barrett multiplication
as in the AArch64 backend. A notable difference is that RV32-IM does
not provide SQRDMULH, so we fall back to a plain MULH, which truncates
the high product instead of rounding it. This changes the 'magic
constant' slightly and increases the output bound of the modular
multiplication to 5/4 * MLDSA_Q; this is already detailed in
proofs/isabelle/neon_ntt. The larger multiplication bound is reflected
in a larger output bound for the forward NTT, which has already been
adopted in a previous commit and is known not to cause conflicts with
the rest of the code. For the inverse NTT, however, an output bound
above MLDSA_Q is problematic. Hence, for the final scaling step in the
inverse NTT, we 'emulate' an SQRDMULH using MULH + ADD + SRAI. This
kernel still needs to be proved correct in the 'Neon' NTT paper
adaptation in proofs/isabelle/neon_ntt.
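
To make the bound discussion above concrete, here is a minimal C model of the two multiplication flavours. It is a sketch under stated assumptions, not the assembly from this PR: `mulh`, `twist`, `finish`, `barrett_mul_floor` and `barrett_mul_round` are hypothetical names, the constant scalings (2^32 and 2^33) are assumptions chosen so the arithmetic works out, and the MULH + ADD + SRAI sequence shown is only one plausible way to obtain a rounded quotient estimate.

```c
#include <stdint.h>

#define MLDSA_Q 8380417 /* 2^23 - 2^13 + 1 */

/* Portable model of RV32-M MULH: upper 32 bits of the signed 64-bit
 * product. Assumes >> on a negative int64_t is an arithmetic shift. */
static int32_t mulh(int32_t a, int32_t b)
{
  return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Hypothetical helper: round(b * 2^shift / MLDSA_Q), to nearest.
 * The caller must keep |b| small enough for the result to fit 32 bits. */
static int32_t twist(int32_t b, int shift)
{
  int64_t n = 2 * (int64_t)b * ((int64_t)1 << shift) + MLDSA_Q;
  int64_t d = 2 * (int64_t)MLDSA_Q;
  int64_t r = n / d;
  if (n % d < 0)
  {
    r--; /* C division truncates toward zero; adjust to floor */
  }
  return (int32_t)r;
}

/* a*b - k*q computed modulo 2^32, as a MUL/MUL/SUB sequence would.
 * The final conversion relies on two's-complement wrapping. */
static int32_t finish(int32_t a, int32_t b, int32_t k)
{
  uint32_t lo = (uint32_t)a * (uint32_t)b - (uint32_t)k * (uint32_t)MLDSA_Q;
  return (int32_t)lo;
}

/* Plain MULH: k = floor(a * t32 / 2^32) with t32 = twist(b, 32).
 * Assumes |b| < q/2. The result is congruent to a*b mod q. */
static int32_t barrett_mul_floor(int32_t a, int32_t b, int32_t t32)
{
  return finish(a, b, mulh(a, t32));
}

/* MULH + ADD + SRAI: t33 = twist(b, 33) carries one extra fraction bit,
 * so adding 1 and shifting right by 1 rounds the quotient estimate.
 * Assumes |b| < q/4 so that t33 still fits in 32 bits. */
static int32_t barrett_mul_round(int32_t a, int32_t b, int32_t t33)
{
  int32_t k = (mulh(a, t33) + 1) >> 1;
  return finish(a, b, k);
}
```

Under these assumptions, and for any 32-bit `a`, the truncating variant returns a value in [-q/4, 5q/4], matching the 5/4 * MLDSA_Q bound quoted above, while the rounding variant stays within 5q/8 in absolute value and therefore below MLDSA_Q.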

We are interested in different classes of RV32-IM CPUs, some with fast,
some with slow multipliers. For CPUs with a very slow multiplier, such
as Hummingbird E203 (https://github.com/riscv-mcu/e203_hbirdv2), cycles
can be saved by leveraging MLDSA_Q = 2^23 - 2^13 + 1 and rewriting
the low-multiplication by MLDSA_Q as a sequence of SHIFT+ADD/SUB.
We offer this alternative version of the backend under a compile-time
flag MLD_USE_NATIVE_RV32IM_SLOW_MULTIPLIER.
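
The shift-based rewrite described above can be illustrated in C as follows. This is a sketch only: `mul_q_mul`, `mul_q_shift` and `mul_q` are hypothetical names, and how the actual backend wires the flag into its assembly is an assumption. It relies on the identity x * MLDSA_Q = (x << 23) - (x << 13) + x, taken modulo 2^32.

```c
#include <stdint.h>

#define MLDSA_Q 8380417 /* 2^23 - 2^13 + 1 */

/* Low 32 bits of x * MLDSA_Q using the hardware multiplier (MUL). */
static uint32_t mul_q_mul(uint32_t x)
{
  return x * (uint32_t)MLDSA_Q;
}

/* Same value using two shifts, one subtraction and one addition.
 * Unsigned arithmetic wraps modulo 2^32, matching the low product. */
static uint32_t mul_q_shift(uint32_t x)
{
  return (x << 23) - (x << 13) + x;
}

/* Hypothetical selection on the compile-time flag named above. */
static uint32_t mul_q(uint32_t x)
{
#if defined(MLD_USE_NATIVE_RV32IM_SLOW_MULTIPLIER)
  return mul_q_shift(x);
#else
  return mul_q_mul(x);
#endif
}
```

Both variants return the same 32-bit value for every input, so the choice only trades one multiplication for four single-cycle ALU operations.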

CI runs the functional, KAT, ACVP and wycheproof suites under
qemu-riscv32 for both the default and the slow-multiplier variant.

oqs-bot (Contributor) commented May 15, 2026

CBMC Results (ML-DSA-65, REDUCE-RAM)

Full Results (205 proofs)
Proof Status Current Previous Change
**TOTAL** 1625s 1583s +2.7%
mld_invntt_layer 185s 178s +4%
poly_pointwise_montgomery_c 153s 141s +9%
rej_uniform_native 129s 129s +0%
polyvec_matrix_pointwise_montgomery_yvec 83s 83s +0%
mld_ct_memcmp 73s 66s +11%
fqmul 44s 43s +2%
mld_ntt_layer 44s 44s +0%
polyveck_chknorm 41s 39s +5%
mld_attempt_signature_generation 27s 21s +29%
mld_ntt_butterfly_block 24s 22s +9%
keccakf1600x4_permute_native 22s 24s -8%
poly_chknorm_c 20s 18s +11%
polyt0_unpack 17s 15s +13%
polyveck_decompose 17s 18s -6%
rej_uniform_c 16s 17s -6%
sign_verify_internal 16s 16s +0%
mld_check_pct 15s 16s -6%
polyvecl_chknorm 14s 16s -12%
poly_add 11s 11s +0%
poly_uniform_eta_4x 11s 11s +0%
keccak_absorb_once_x4 10s 9s +11%
poly_invntt_tomont_c 10s 10s +0%
polyvec_matrix_pointwise_montgomery_row 9s 10s -10%
polyveck_caddq 9s 6s +50%
polyvecl_ntt 9s 8s +12%
sign_verify_pre_hash_internal 9s 5s +80%
pointwise_acc_native_aarch64 8s 5s +60%
polyveck_invntt_tomont 8s 7s +14%
polyz_unpack_c 8s 5s +60%
sign_pk_from_sk 8s 4s +100%
compute_pack_t0_t1 7s 6s +17%
keccak_absorb 7s 9s -22%
pointwise_acc_native_x86_64 7s 7s +0%
poly_caddq_c 7s 5s +40%
poly_power2round 7s 8s -12%
rej_uniform 7s 9s -22%
mld_compute_pack_z 6s 5s +20%
mld_keccakf1600_permute_c 6s 7s -14%
pack_sk_rho_key_tr_s2 6s 3s +100%
polyt0_pack 6s 3s +100%
polyw1_pack_32 6s 4s +50%
sign 6s 7s -14%
sign_open 6s 4s +50%
sign_signature_extmu 6s 2s +200%
sign_signature_pre_hash_shake256 6s 5s +20%
yvec_get_poly 6s 2s +200%
intt_native_aarch64 5s 4s +25%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 5s 3s +67%
mld_prepare_domain_separation_prefix 5s 5s +0%
montgomery_reduce 5s 2s +150%
poly_decompose_c 5s 4s +25%
poly_shiftl 5s 3s +67%
poly_uniform_gamma1_4x 5s 3s +67%
poly_use_hint_native_aarch64 5s 2s +150%
polyeta_unpack 5s 5s +0%
polyveck_reduce 5s 6s -17%
polyz_unpack_native_x86_64 5s 3s +67%
rej_uniform_native_aarch64 5s 4s +25%
shake256_squeeze 5s 2s +150%
sign_keypair 5s 5s +0%
sign_signature_internal 5s 5s +0%
sign_signature_pre_hash_internal 5s 7s -29%
yvec_init 5s 7s -29%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 4s 1s +300%
keccakf1600x4_extract_bytes 4s 4s +0%
make_hint 4s 4s +0%
mld_ct_get_optblocker_u32 4s 2s +100%
mld_sample_s1_s2_serial 4s 4s +0%
mld_value_barrier_u8 4s 2s +100%
pack_sk_s1 4s 2s +100%
pointwise_native_x86_64 4s 3s +33%
poly_caddq_native_aarch64 4s 2s +100%
poly_challenge 4s 5s -20%
poly_chknorm 4s 3s +33%
poly_invntt_tomont 4s 3s +33%
poly_invntt_tomont_native 4s 2s +100%
poly_reduce 4s 5s -20%
poly_uniform 4s 5s -20%
poly_uniform_eta 4s 4s +0%
polyvecl_pointwise_acc_montgomery 4s 4s +0%
polyvecl_pointwise_acc_montgomery_c 4s 4s +0%
polyvecl_unpack_eta 4s 2s +100%
polyz_unpack_17_native_aarch64 4s 2s +100%
polyz_unpack_native 4s 2s +100%
rej_eta_native 4s 3s +33%
shake256_finalize 4s 1s +300%
shake256_init 4s 3s +33%
shake256x4_absorb_once 4s 5s -20%
sig_unpack_hints 4s 3s +33%
sign_keypair_internal 4s 7s -43%
sign_signature 4s 3s +33%
sign_verify 4s 5s -20%
sign_verify_extmu 4s 2s +100%
unpack_sk_s2hat 4s 4s +0%
caddq 3s 3s +0%
decompose 3s 4s -25%
intt_native_x86_64 3s 3s +0%
keccak_finalize 3s 3s +0%
keccak_squeeze 3s 4s -25%
keccak_squeezeblocks_x4 3s 4s -25%
keccakf1600_xor_bytes 3s 1s +200%
keccakf1600x4_xor_bytes 3s 2s +50%
mld_ct_cmask_neg_i32 3s 3s +0%
mld_keccakf1600x4_extract_bytes_c 3s 2s +50%
mld_sample_s1_s2 3s 7s -57%
mld_value_barrier_i64 3s 3s +0%
ntt_native_aarch64 3s 3s +0%
pack_sig_c 3s 4s -25%
pack_sig_h 3s 2s +50%
pack_sig_z 3s 1s +200%
poly_caddq_native_x86_64 3s 2s +50%
poly_chknorm_native_aarch64 3s 4s -25%
poly_chknorm_native_x86_64 3s 4s -25%
poly_decompose 3s 3s +0%
poly_decompose_32_native_aarch64 3s 3s +0%
poly_decompose_88_native_aarch64 3s 2s +50%
poly_decompose_native 3s 2s +50%
poly_ntt 3s 3s +0%
poly_ntt_c 3s 3s +0%
poly_ntt_native 3s 1s +200%
poly_permute_bitrev_to_custom_optional 3s 1s +200%
poly_pointwise_montgomery 3s 3s +0%
poly_sub 3s 3s +0%
poly_uniform_gamma1 3s 3s +0%
poly_use_hint 3s 2s +50%
poly_use_hint_c 3s 2s +50%
poly_use_hint_native 3s 4s -25%
polyt1_pack 3s 1s +200%
polyt1_unpack 3s 3s +0%
polyveck_unpack_eta 3s 3s +0%
polyvecl_pack_eta 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 4s -25%
polyvecl_unpack_z 3s 3s +0%
polyw1_pack 3s 2s +50%
polyz_pack 3s 3s +0%
polyz_unpack 3s 4s -25%
polyz_unpack_19_native_aarch64 3s 4s -25%
power2round 3s 2s +50%
reduce32 3s 3s +0%
rej_eta_c 3s 5s -40%
rej_uniform_eta_native_aarch64 3s 3s +0%
shake128_absorb 3s 1s +200%
shake128_finalize 3s 2s +50%
shake128_release 3s 3s +0%
shake128_squeeze 3s 2s +50%
shake256_absorb 3s 1s +200%
shake256_release 3s 2s +50%
sk_s2hat_get_poly 3s 4s -25%
unpack_sk 3s 3s +0%
unpack_sk_s1hat 3s 2s +50%
use_hint 3s 5s -40%
fqscale 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 2s +0%
keccak_init 2s 2s +0%
keccakf1600_extract_bytes (big endian) 2s 3s -33%
keccakf1600_permute 2s 2s +0%
keccakf1600_permute_native 2s 3s -33%
keccakf1600x4_extract_bytes_native 2s 4s -50%
keccakf1600x4_xor_bytes_native 2s 2s +0%
mld_ct_abs_i32 2s 2s +0%
mld_ct_cmask_nonzero_u8 2s 5s -60%
mld_ct_get_optblocker_i64 2s 1s +100%
mld_ct_get_optblocker_u8 2s 3s -33%
mld_ct_sel_int32 2s 3s -33%
mld_h 2s 4s -50%
mld_keccakf1600x4_xor_bytes_c 2s 2s +0%
mld_polymat_expand_entry 2s 3s -33%
mld_value_barrier_u32 2s 3s -33%
ntt_native_x86_64 2s 4s -50%
nttunpack_native_x86_64 2s 5s -60%
pointwise_native_aarch64 2s 4s -50%
poly_caddq 2s 4s -50%
poly_caddq_native 2s 3s -33%
poly_chknorm_native 2s 2s +0%
poly_permute_bitrev_to_custom_optional_native 2s 5s -60%
polyeta_pack 2s 3s -33%
polyvec_matrix_expand 2s 3s -33%
polyvec_matrix_expand_serial 2s 2s +0%
polyveck_ntt 2s 3s -33%
polyveck_pack_w1 2s 3s -33%
polyvecl_pointwise_acc_montgomery_native 2s 4s -50%
polyvecl_uniform_gamma1 2s 4s -50%
shake128_init 2s 2s +0%
shake128x4_squeezeblocks 2s 3s -33%
shake256 2s 3s -33%
shake256x4_squeezeblocks 2s 3s -33%
sign_verify_pre_hash_shake256 2s 4s -50%
sk_s1hat_get_poly 2s 1s +100%
sys_check_capability 2s 2s +0%
unpack_pk_t1 2s 5s -60%
keccak_f1600_x1_native_aarch64 1s 2s -50%
keccak_f1600_x1_native_aarch64_v84a 1s 2s -50%
keccak_f1600_x4_native_aarch64_v84a 1s 3s -67%
keccakf1600_xor_bytes (big endian) 1s 3s -67%
keccakf1600x4_permute 1s 3s -67%
mld_ct_cmask_nonzero_u32 1s 4s -75%
mld_keccakf1600_extract_bytes 1s 2s -50%
poly_pointwise_montgomery_native 1s 5s -80%
poly_uniform_4x 1s 3s -67%
polyveck_pack_eta 1s 4s -75%
polyw1_pack_88 1s 3s -67%
rej_eta 1s 4s -75%
shake128x4_absorb_once 1s 2s -50%
sk_t0hat_get_poly 1s 5s -80%
unpack_sk_t0hat 1s 3s -67%

oqs-bot (Contributor) commented May 15, 2026

CBMC Results (ML-DSA-44, REDUCE-RAM)

Full Results (205 proofs)
Proof Status Current Previous Change
**TOTAL** 1550s 1628s -4.8%
mld_invntt_layer 168s 177s -5%
poly_pointwise_montgomery_c 128s 142s -10%
rej_uniform_native 126s 127s -1%
polyvec_matrix_pointwise_montgomery_yvec 113s 122s -7%
mld_ct_memcmp 66s 69s -4%
fqmul 43s 42s +2%
mld_ntt_layer 42s 47s -11%
mld_ntt_butterfly_block 26s 22s +18%
mld_attempt_signature_generation 24s 27s -11%
sign_verify_internal 23s 24s -4%
poly_chknorm_c 22s 21s +5%
keccakf1600x4_permute_native 21s 22s -5%
mld_check_pct 15s 14s +7%
polyt0_unpack 15s 18s -17%
rej_uniform_c 15s 17s -12%
polyeta_unpack 14s 15s -7%
poly_uniform_eta_4x 13s 15s -13%
poly_add 11s 12s -8%
polyz_unpack_c 11s 10s +10%
compute_pack_t0_t1 10s 9s +11%
mld_compute_pack_z 9s 5s +80%
poly_invntt_tomont_c 9s 9s +0%
keccak_absorb_once_x4 8s 10s -20%
poly_decompose_c 8s 8s +0%
poly_power2round 8s 6s +33%
polyveck_chknorm 8s 7s +14%
polyvecl_chknorm 8s 6s +33%
rej_uniform 8s 8s +0%
sign_pk_from_sk 8s 6s +33%
mld_keccakf1600_permute_c 7s 8s -12%
pointwise_acc_native_aarch64 7s 5s +40%
poly_caddq_native_aarch64 7s 5s +40%
poly_chknorm_native 7s 4s +75%
polyvec_matrix_pointwise_montgomery_row 7s 10s -30%
sign 7s 6s +17%
keccak_absorb 6s 8s -25%
mld_sample_s1_s2_serial 6s 3s +100%
poly_reduce 6s 3s +100%
poly_use_hint_native 6s 3s +100%
polyveck_decompose 6s 7s -14%
polyveck_invntt_tomont 6s 5s +20%
polyveck_ntt 6s 2s +200%
shake256_release 6s 3s +100%
mld_prepare_domain_separation_prefix 5s 4s +25%
mld_sample_s1_s2 5s 5s +0%
ntt_native_x86_64 5s 3s +67%
pointwise_acc_native_x86_64 5s 6s -17%
pointwise_native_x86_64 5s 3s +67%
poly_challenge 5s 4s +25%
poly_chknorm 5s 2s +150%
poly_uniform 5s 5s +0%
polyveck_caddq 5s 6s -17%
reduce32 5s 3s +67%
rej_eta_native 5s 4s +25%
sign_open 5s 4s +25%
sign_signature_extmu 5s 2s +150%
sign_verify_extmu 5s 4s +25%
sign_verify_pre_hash_internal 5s 5s +0%
intt_native_aarch64 4s 4s +0%
intt_native_x86_64 4s 2s +100%
keccak_squeezeblocks_x4 4s 5s -20%
mld_ct_get_optblocker_u8 4s 4s +0%
pack_sig_h 4s 4s +0%
pointwise_native_aarch64 4s 6s -33%
poly_caddq_native 4s 4s +0%
poly_caddq_native_x86_64 4s 2s +100%
poly_decompose 4s 5s -20%
poly_permute_bitrev_to_custom_optional_native 4s 3s +33%
poly_pointwise_montgomery_native 4s 3s +33%
poly_shiftl 4s 4s +0%
poly_uniform_4x 4s 2s +100%
poly_use_hint_native_aarch64 4s 2s +100%
polyt0_pack 4s 3s +33%
polyt1_pack 4s 3s +33%
polyvec_matrix_expand 4s 1s +300%
polyveck_pack_eta 4s 4s +0%
polyvecl_uniform_gamma1 4s 3s +33%
polyvecl_uniform_gamma1_serial 4s 4s +0%
polyvecl_unpack_eta 4s 2s +100%
polyw1_pack_88 4s 3s +33%
polyz_unpack_native 4s 4s +0%
polyz_unpack_native_x86_64 4s 3s +33%
power2round 4s 5s -20%
rej_uniform_native_aarch64 4s 3s +33%
sig_unpack_hints 4s 3s +33%
sign_keypair_internal 4s 5s -20%
sign_signature 4s 6s -33%
sign_signature_pre_hash_shake256 4s 5s -20%
sign_verify 4s 4s +0%
sk_s2hat_get_poly 4s 3s +33%
sk_t0hat_get_poly 4s 3s +33%
unpack_sk_s2hat 4s 2s +100%
caddq 3s 1s +200%
decompose 3s 3s +0%
keccak_f1600_x4_native_aarch64_v84a 3s 3s +0%
keccakf1600x4_extract_bytes_native 3s 3s +0%
keccakf1600x4_permute 3s 2s +50%
keccakf1600x4_xor_bytes 3s 2s +50%
mld_ct_cmask_nonzero_u32 3s 2s +50%
mld_ct_cmask_nonzero_u8 3s 2s +50%
mld_ct_get_optblocker_u32 3s 3s +0%
mld_h 3s 5s -40%
mld_keccakf1600_extract_bytes 3s 4s -25%
mld_polymat_expand_entry 3s 3s +0%
pack_sk_rho_key_tr_s2 3s 6s -50%
pack_sk_s1 3s 5s -40%
poly_caddq 3s 5s -40%
poly_caddq_c 3s 3s +0%
poly_chknorm_native_aarch64 3s 4s -25%
poly_chknorm_native_x86_64 3s 2s +50%
poly_decompose_88_native_aarch64 3s 4s -25%
poly_invntt_tomont 3s 4s -25%
poly_invntt_tomont_native 3s 3s +0%
poly_ntt 3s 1s +200%
poly_ntt_c 3s 4s -25%
poly_permute_bitrev_to_custom_optional 3s 3s +0%
poly_uniform_eta 3s 8s -62%
poly_uniform_gamma1_4x 3s 2s +50%
poly_use_hint 3s 3s +0%
poly_use_hint_c 3s 1s +200%
polyeta_pack 3s 3s +0%
polyt1_unpack 3s 5s -40%
polyvec_matrix_expand_serial 3s 2s +50%
polyveck_pack_w1 3s 3s +0%
polyveck_reduce 3s 7s -57%
polyveck_unpack_eta 3s 3s +0%
polyvecl_ntt 3s 5s -40%
polyvecl_pack_eta 3s 2s +50%
polyvecl_pointwise_acc_montgomery_native 3s 5s -40%
polyvecl_unpack_z 3s 3s +0%
polyz_unpack_19_native_aarch64 3s 4s -25%
rej_eta 3s 4s -25%
shake128_finalize 3s 3s +0%
shake128_squeeze 3s 4s -25%
shake128x4_squeezeblocks 3s 1s +200%
shake256_squeeze 3s 3s +0%
shake256x4_squeezeblocks 3s 3s +0%
sign_keypair 3s 3s +0%
sign_signature_internal 3s 8s -62%
sign_signature_pre_hash_internal 3s 5s -40%
sign_verify_pre_hash_shake256 3s 3s +0%
sys_check_capability 3s 3s +0%
unpack_sk 3s 2s +50%
unpack_sk_s1hat 3s 4s -25%
unpack_sk_t0hat 3s 3s +0%
yvec_get_poly 3s 5s -40%
fqscale 2s 5s -60%
keccak_f1600_x1_native_aarch64 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 5s -60%
keccak_f1600_x4_native_avx2 2s 4s -50%
keccak_finalize 2s 2s +0%
keccak_init 2s 3s -33%
keccakf1600_permute 2s 2s +0%
keccakf1600_permute_native 2s 4s -50%
keccakf1600_xor_bytes 2s 4s -50%
keccakf1600_xor_bytes (big endian) 2s 2s +0%
keccakf1600x4_extract_bytes 2s 2s +0%
keccakf1600x4_xor_bytes_native 2s 3s -33%
make_hint 2s 4s -50%
mld_ct_cmask_neg_i32 2s 1s +100%
mld_ct_get_optblocker_i64 2s 2s +0%
mld_ct_sel_int32 2s 2s +0%
mld_keccakf1600x4_extract_bytes_c 2s 2s +0%
mld_keccakf1600x4_xor_bytes_c 2s 2s +0%
mld_value_barrier_i64 2s 4s -50%
mld_value_barrier_u32 2s 1s +100%
mld_value_barrier_u8 2s 2s +0%
ntt_native_aarch64 2s 4s -50%
nttunpack_native_x86_64 2s 2s +0%
pack_sig_c 2s 3s -33%
pack_sig_z 2s 3s -33%
poly_decompose_native 2s 5s -60%
poly_ntt_native 2s 4s -50%
poly_pointwise_montgomery 2s 3s -33%
poly_sub 2s 4s -50%
poly_uniform_gamma1 2s 2s +0%
polyvecl_pointwise_acc_montgomery 2s 1s +100%
polyvecl_pointwise_acc_montgomery_c 2s 4s -50%
polyw1_pack 2s 3s -33%
polyw1_pack_32 2s 2s +0%
polyz_pack 2s 5s -60%
polyz_unpack 2s 3s -33%
polyz_unpack_17_native_aarch64 2s 4s -50%
rej_eta_c 2s 4s -50%
rej_uniform_eta_native_aarch64 2s 3s -33%
shake128_absorb 2s 2s +0%
shake128x4_absorb_once 2s 3s -33%
shake256 2s 3s -33%
shake256_init 2s 1s +100%
shake256x4_absorb_once 2s 1s +100%
sk_s1hat_get_poly 2s 2s +0%
unpack_pk_t1 2s 3s -33%
yvec_init 2s 3s -33%
keccak_f1600_x1_native_aarch64_v84a 1s 4s -75%
keccak_squeeze 1s 4s -75%
keccakf1600_extract_bytes (big endian) 1s 2s -50%
mld_ct_abs_i32 1s 3s -67%
montgomery_reduce 1s 2s -50%
poly_decompose_32_native_aarch64 1s 3s -67%
shake128_init 1s 2s -50%
shake128_release 1s 4s -75%
shake256_absorb 1s 3s -67%
shake256_finalize 1s 4s -75%
use_hint 1s 3s -67%

oqs-bot (Contributor) commented May 15, 2026

CBMC Results (ML-DSA-87, REDUCE-RAM)

Full Results (205 proofs)
Proof Status Current Previous Change
**TOTAL** 1601s 1624s -1.4%
mld_invntt_layer 177s 173s +2%
polyvec_matrix_pointwise_montgomery_yvec 155s 162s -4%
poly_pointwise_montgomery_c 137s 138s -1%
rej_uniform_native 127s 124s +2%
mld_ct_memcmp 68s 71s -4%
fqmul 42s 43s -2%
mld_ntt_layer 41s 46s -11%
mld_attempt_signature_generation 35s 34s +3%
keccakf1600x4_permute_native 23s 21s +10%
mld_ntt_butterfly_block 22s 20s +10%
sign_verify_internal 21s 21s +0%
poly_chknorm_c 19s 20s -5%
polyveck_decompose 17s 17s +0%
polyt0_unpack 16s 17s -6%
polyeta_unpack 15s 14s +7%
poly_uniform_eta_4x 14s 14s +0%
rej_uniform_c 14s 16s -12%
mld_check_pct 12s 14s -14%
polyvecl_chknorm 11s 11s +0%
compute_pack_t0_t1 10s 11s -9%
keccak_absorb 10s 6s +67%
poly_add 10s 12s -17%
keccak_absorb_once_x4 9s 7s +29%
polyvecl_ntt 9s 7s +29%
poly_invntt_tomont_c 8s 8s +0%
sign 8s 6s +33%
mld_keccakf1600_permute_c 7s 6s +17%
polyvec_matrix_pointwise_montgomery_row 7s 8s -12%
polyveck_caddq 7s 6s +17%
rej_uniform 7s 9s -22%
mld_compute_pack_z 6s 5s +20%
pointwise_acc_native_aarch64 6s 7s -14%
pointwise_acc_native_x86_64 6s 6s +0%
poly_pointwise_montgomery_native 6s 2s +200%
poly_shiftl 6s 3s +100%
polyveck_invntt_tomont 6s 6s +0%
sign_open 6s 6s +0%
sign_signature_pre_hash_internal 6s 2s +200%
intt_native_x86_64 5s 4s +25%
mld_ct_get_optblocker_u32 5s 3s +67%
mld_sample_s1_s2_serial 5s 4s +25%
nttunpack_native_x86_64 5s 2s +150%
poly_caddq_c 5s 3s +67%
poly_chknorm_native 5s 4s +25%
poly_ntt 5s 2s +150%
poly_ntt_c 5s 3s +67%
poly_power2round 5s 4s +25%
poly_uniform 5s 4s +25%
polyveck_chknorm 5s 2s +150%
polyveck_reduce 5s 6s -17%
polyz_unpack_c 5s 5s +0%
polyz_unpack_native 5s 5s +0%
sign_keypair 5s 6s -17%
sign_keypair_internal 5s 5s +0%
sign_pk_from_sk 5s 5s +0%
sign_signature 5s 3s +67%
sign_verify 5s 4s +25%
unpack_sk 5s 4s +25%
unpack_sk_t0hat 5s 2s +150%
decompose 4s 2s +100%
keccak_squeeze 4s 2s +100%
keccakf1600x4_xor_bytes 4s 2s +100%
mld_h 4s 4s +0%
mld_keccakf1600_extract_bytes 4s 4s +0%
mld_sample_s1_s2 4s 7s -43%
pack_sk_s1 4s 2s +100%
poly_caddq 4s 2s +100%
poly_caddq_native 4s 2s +100%
poly_caddq_native_aarch64 4s 1s +300%
poly_challenge 4s 4s +0%
poly_chknorm 4s 5s -20%
poly_decompose 4s 2s +100%
poly_permute_bitrev_to_custom_optional_native 4s 3s +33%
poly_reduce 4s 5s -20%
poly_uniform_eta 4s 4s +0%
poly_uniform_gamma1 4s 4s +0%
poly_use_hint_native_aarch64 4s 2s +100%
polyt0_pack 4s 5s -20%
polyt1_unpack 4s 6s -33%
polyvec_matrix_expand 4s 1s +300%
polyvec_matrix_expand_serial 4s 2s +100%
polyveck_ntt 4s 3s +33%
polyveck_unpack_eta 4s 4s +0%
polyvecl_uniform_gamma1_serial 4s 2s +100%
polyvecl_unpack_eta 4s 2s +100%
polyvecl_unpack_z 4s 5s -20%
polyw1_pack_88 4s 3s +33%
polyz_pack 4s 4s +0%
reduce32 4s 4s +0%
rej_eta_native 4s 4s +0%
rej_uniform_eta_native_aarch64 4s 2s +100%
sig_unpack_hints 4s 2s +100%
sk_t0hat_get_poly 4s 8s -50%
use_hint 4s 2s +100%
intt_native_aarch64 3s 4s -25%
keccak_f1600_x1_native_aarch64_v84a 3s 3s +0%
keccak_squeezeblocks_x4 3s 5s -40%
keccakf1600_permute_native 3s 2s +50%
keccakf1600_xor_bytes 3s 1s +200%
keccakf1600x4_extract_bytes 3s 3s +0%
mld_ct_cmask_nonzero_u8 3s 5s -40%
mld_ct_sel_int32 3s 1s +200%
mld_prepare_domain_separation_prefix 3s 5s -40%
montgomery_reduce 3s 4s -25%
ntt_native_x86_64 3s 1s +200%
pointwise_native_aarch64 3s 2s +50%
pointwise_native_x86_64 3s 3s +0%
poly_caddq_native_x86_64 3s 4s -25%
poly_chknorm_native_x86_64 3s 3s +0%
poly_decompose_32_native_aarch64 3s 3s +0%
poly_decompose_c 3s 4s -25%
poly_ntt_native 3s 3s +0%
poly_permute_bitrev_to_custom_optional 3s 2s +50%
poly_pointwise_montgomery 3s 3s +0%
poly_uniform_4x 3s 3s +0%
poly_use_hint 3s 4s -25%
poly_use_hint_c 3s 2s +50%
poly_use_hint_native 3s 2s +50%
polyeta_pack 3s 2s +50%
polyveck_pack_eta 3s 4s -25%
polyvecl_pointwise_acc_montgomery_c 3s 4s -25%
polyvecl_pointwise_acc_montgomery_native 3s 4s -25%
polyvecl_uniform_gamma1 3s 2s +50%
polyw1_pack 3s 4s -25%
polyz_unpack_19_native_aarch64 3s 3s +0%
polyz_unpack_native_x86_64 3s 4s -25%
power2round 3s 3s +0%
shake128_release 3s 3s +0%
shake256x4_absorb_once 3s 2s +50%
sign_signature_extmu 3s 4s -25%
sign_signature_internal 3s 6s -50%
sign_signature_pre_hash_shake256 3s 7s -57%
sign_verify_pre_hash_internal 3s 2s +50%
sign_verify_pre_hash_shake256 3s 2s +50%
unpack_pk_t1 3s 4s -25%
unpack_sk_s1hat 3s 4s -25%
yvec_init 3s 2s +50%
caddq 2s 3s -33%
fqscale 2s 5s -60%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
keccak_f1600_x4_native_avx2 2s 4s -50%
keccak_init 2s 3s -33%
keccakf1600_extract_bytes (big endian) 2s 5s -60%
keccakf1600_permute 2s 2s +0%
keccakf1600_xor_bytes (big endian) 2s 2s +0%
keccakf1600x4_extract_bytes_native 2s 2s +0%
keccakf1600x4_permute 2s 2s +0%
keccakf1600x4_xor_bytes_native 2s 2s +0%
mld_ct_abs_i32 2s 2s +0%
mld_ct_cmask_neg_i32 2s 2s +0%
mld_ct_cmask_nonzero_u32 2s 2s +0%
mld_keccakf1600x4_extract_bytes_c 2s 2s +0%
mld_keccakf1600x4_xor_bytes_c 2s 5s -60%
mld_polymat_expand_entry 2s 3s -33%
mld_value_barrier_i64 2s 2s +0%
ntt_native_aarch64 2s 4s -50%
pack_sig_c 2s 2s +0%
pack_sig_h 2s 4s -50%
pack_sk_rho_key_tr_s2 2s 4s -50%
poly_chknorm_native_aarch64 2s 3s -33%
poly_decompose_88_native_aarch64 2s 2s +0%
poly_decompose_native 2s 4s -50%
poly_invntt_tomont 2s 2s +0%
poly_invntt_tomont_native 2s 4s -50%
poly_uniform_gamma1_4x 2s 2s +0%
polyt1_pack 2s 4s -50%
polyveck_pack_w1 2s 4s -50%
polyvecl_pack_eta 2s 4s -50%
polyvecl_pointwise_acc_montgomery 2s 4s -50%
polyw1_pack_32 2s 4s -50%
polyz_unpack 2s 4s -50%
polyz_unpack_17_native_aarch64 2s 2s +0%
rej_eta_c 2s 3s -33%
rej_uniform_native_aarch64 2s 5s -60%
shake128_absorb 2s 2s +0%
shake128_init 2s 3s -33%
shake128x4_absorb_once 2s 2s +0%
shake128x4_squeezeblocks 2s 3s -33%
shake256_absorb 2s 1s +100%
shake256_finalize 2s 2s +0%
shake256_release 2s 2s +0%
shake256x4_squeezeblocks 2s 2s +0%
sign_verify_extmu 2s 4s -50%
sk_s1hat_get_poly 2s 2s +0%
sk_s2hat_get_poly 2s 1s +100%
sys_check_capability 2s 3s -33%
unpack_sk_s2hat 2s 3s -33%
yvec_get_poly 2s 1s +100%
keccak_f1600_x1_native_aarch64 1s 2s -50%
keccak_finalize 1s 2s -50%
make_hint 1s 5s -80%
mld_ct_get_optblocker_i64 1s 1s +0%
mld_ct_get_optblocker_u8 1s 2s -50%
mld_value_barrier_u32 1s 4s -75%
mld_value_barrier_u8 1s 3s -67%
pack_sig_z 1s 5s -80%
poly_sub 1s 3s -67%
rej_eta 1s 2s -50%
shake128_finalize 1s 3s -67%
shake128_squeeze 1s 2s -50%
shake256 1s 1s +0%
shake256_init 1s 4s -75%
shake256_squeeze 1s 2s -50%

oqs-bot (Contributor) commented May 15, 2026

CBMC Results (ML-DSA-44)

Full Results (205 proofs)
Proof Status Current Previous Change
**TOTAL** 1771s 1727s +2.5%
mld_invntt_layer 283s 273s +4%
rej_uniform_native 144s 140s +3%
polyvecl_pointwise_acc_montgomery_c 121s 117s +3%
poly_pointwise_montgomery_c 95s 97s -2%
mld_ct_memcmp 67s 70s -4%
mld_attempt_signature_generation 63s 64s -2%
mld_ntt_layer 44s 43s +2%
fqmul 41s 40s +2%
polyvec_matrix_expand 27s 24s +12%
sign_verify_internal 26s 29s -10%
mld_ntt_butterfly_block 24s 20s +20%
keccakf1600x4_permute_native 22s 24s -8%
rej_uniform 21s 19s +11%
sign_signature_internal 21s 19s +11%
poly_chknorm_c 19s 20s -5%
mld_check_pct 18s 15s +20%
polyeta_unpack 18s 14s +29%
polyt0_unpack 16s 18s -11%
compute_pack_t0_t1 13s 15s -13%
rej_uniform_c 13s 13s +0%
poly_uniform_4x 12s 11s +9%
poly_uniform_eta_4x 11s 12s -8%
polyz_unpack_c 11s 10s +10%
poly_add 10s 11s -9%
poly_invntt_tomont_c 10s 7s +43%
polyvec_matrix_pointwise_montgomery_yvec 10s 8s +25%
polyveck_chknorm 10s 12s -17%
polyveck_decompose 10s 9s +11%
keccak_absorb_once_x4 8s 10s -20%
mld_compute_pack_z 8s 7s +14%
mld_keccakf1600_permute_c 8s 7s +14%
poly_caddq_c 8s 8s +0%
sign_open 8s 4s +100%
polyvec_matrix_expand_serial 7s 7s +0%
polyz_unpack 7s 4s +75%
sign 7s 10s -30%
keccak_squeezeblocks_x4 6s 2s +200%
pointwise_acc_native_x86_64 6s 7s -14%
poly_caddq_native 6s 5s +20%
poly_decompose_c 6s 4s +50%
poly_uniform_gamma1_4x 6s 3s +100%
polyvecl_pointwise_acc_montgomery_native 6s 6s +0%
shake256x4_absorb_once 6s 2s +200%
intt_native_x86_64 5s 3s +67%
keccak_absorb 5s 8s -38%
mld_prepare_domain_separation_prefix 5s 3s +67%
mld_sample_s1_s2_serial 5s 4s +25%
pointwise_acc_native_aarch64 5s 5s +0%
poly_challenge 5s 5s +0%
poly_ntt_native 5s 5s +0%
poly_permute_bitrev_to_custom_optional_native 5s 4s +25%
poly_power2round 5s 4s +25%
poly_uniform 5s 4s +25%
poly_use_hint_c 5s 5s +0%
polyvecl_chknorm 5s 1s +400%
polyz_unpack_native_x86_64 5s 1s +400%
rej_uniform_native_aarch64 5s 3s +67%
shake256_init 5s 3s +67%
sign_signature_extmu 5s 3s +67%
sign_verify_extmu 5s 5s +0%
sign_verify_pre_hash_shake256 5s 5s +0%
unpack_pk_t1 5s 4s +25%
unpack_sk_t0hat 5s 3s +67%
intt_native_aarch64 4s 2s +100%
keccak_f1600_x1_native_aarch64_v84a 4s 4s +0%
mld_keccakf1600_extract_bytes 4s 3s +33%
mld_polymat_expand_entry 4s 2s +100%
mld_value_barrier_u8 4s 4s +0%
pack_sig_c 4s 4s +0%
pack_sk_rho_key_tr_s2 4s 3s +33%
pointwise_native_aarch64 4s 5s -20%
poly_chknorm_native_aarch64 4s 5s -20%
poly_decompose_32_native_aarch64 4s 2s +100%
poly_permute_bitrev_to_custom_optional 4s 3s +33%
poly_reduce 4s 4s +0%
poly_sub 4s 2s +100%
poly_uniform_eta 4s 2s +100%
polyeta_pack 4s 3s +33%
polyt0_pack 4s 5s -20%
polyveck_pack_w1 4s 4s +0%
polyvecl_ntt 4s 4s +0%
polyvecl_pack_eta 4s 2s +100%
polyvecl_unpack_z 4s 3s +33%
polyw1_pack_32 4s 2s +100%
polyz_pack 4s 3s +33%
power2round 4s 3s +33%
shake128_absorb 4s 1s +300%
shake128x4_squeezeblocks 4s 1s +300%
sig_unpack_hints 4s 2s +100%
sign_keypair 4s 6s -33%
sign_keypair_internal 4s 3s +33%
sign_pk_from_sk 4s 6s -33%
sign_signature 4s 4s +0%
sign_signature_pre_hash_internal 4s 2s +100%
sign_signature_pre_hash_shake256 4s 4s +0%
sign_verify_pre_hash_internal 4s 4s +0%
caddq 3s 5s -40%
decompose 3s 2s +50%
fqscale 3s 2s +50%
keccak_f1600_x1_native_aarch64 3s 2s +50%
keccak_init 3s 1s +200%
keccakf1600_extract_bytes (big endian) 3s 1s +200%
keccakf1600x4_extract_bytes 3s 3s +0%
keccakf1600x4_extract_bytes_native 3s 2s +50%
make_hint 3s 2s +50%
mld_ct_cmask_nonzero_u32 3s 2s +50%
mld_ct_cmask_nonzero_u8 3s 2s +50%
mld_ct_get_optblocker_u32 3s 2s +50%
mld_ct_get_optblocker_u8 3s 2s +50%
mld_ct_sel_int32 3s 3s +0%
mld_h 3s 5s -40%
mld_keccakf1600x4_extract_bytes_c 3s 1s +200%
mld_keccakf1600x4_xor_bytes_c 3s 2s +50%
mld_sample_s1_s2 3s 3s +0%
mld_value_barrier_u32 3s 1s +200%
ntt_native_x86_64 3s 4s -25%
nttunpack_native_x86_64 3s 3s +0%
pack_sig_h 3s 5s -40%
pointwise_native_x86_64 3s 2s +50%
poly_caddq_native_aarch64 3s 2s +50%
poly_caddq_native_x86_64 3s 2s +50%
poly_chknorm 3s 2s +50%
poly_invntt_tomont 3s 1s +200%
poly_ntt 3s 1s +200%
poly_pointwise_montgomery 3s 3s +0%
poly_pointwise_montgomery_native 3s 3s +0%
poly_shiftl 3s 3s +0%
poly_use_hint 3s 4s -25%
polyvec_matrix_pointwise_montgomery_row 3s 3s +0%
polyveck_caddq 3s 2s +50%
polyveck_invntt_tomont 3s 6s -50%
polyveck_ntt 3s 6s -50%
polyveck_reduce 3s 2s +50%
polyveck_unpack_eta 3s 4s -25%
polyvecl_unpack_eta 3s 2s +50%
polyw1_pack 3s 2s +50%
polyz_unpack_17_native_aarch64 3s 2s +50%
polyz_unpack_19_native_aarch64 3s 2s +50%
polyz_unpack_native 3s 3s +0%
rej_eta_native 3s 5s -40%
rej_uniform_eta_native_aarch64 3s 2s +50%
shake128_squeeze 3s 1s +200%
shake256 3s 2s +50%
shake256x4_squeezeblocks 3s 3s +0%
sign_verify 3s 2s +50%
sk_t0hat_get_poly 3s 5s -40%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_avx2 2s 2s +0%
keccak_finalize 2s 2s +0%
keccak_squeeze 2s 3s -33%
keccakf1600_permute 2s 1s +100%
keccakf1600_permute_native 2s 3s -33%
keccakf1600_xor_bytes 2s 2s +0%
keccakf1600x4_permute 2s 3s -33%
keccakf1600x4_xor_bytes 2s 2s +0%
keccakf1600x4_xor_bytes_native 2s 2s +0%
mld_ct_abs_i32 2s 1s +100%
mld_ct_cmask_neg_i32 2s 1s +100%
mld_ct_get_optblocker_i64 2s 2s +0%
montgomery_reduce 2s 3s -33%
pack_sig_z 2s 2s +0%
pack_sk_s1 2s 3s -33%
poly_caddq 2s 4s -50%
poly_chknorm_native_x86_64 2s 3s -33%
poly_decompose 2s 4s -50%
poly_decompose_88_native_aarch64 2s 3s -33%
poly_decompose_native 2s 2s +0%
poly_invntt_tomont_native 2s 3s -33%
poly_ntt_c 2s 2s +0%
poly_use_hint_native 2s 4s -50%
poly_use_hint_native_aarch64 2s 5s -60%
polyt1_pack 2s 3s -33%
polyt1_unpack 2s 3s -33%
polyveck_pack_eta 2s 3s -33%
polyvecl_pointwise_acc_montgomery 2s 2s +0%
polyvecl_uniform_gamma1 2s 1s +100%
polyvecl_uniform_gamma1_serial 2s 5s -60%
polyw1_pack_88 2s 3s -33%
reduce32 2s 3s -33%
rej_eta 2s 3s -33%
shake128_finalize 2s 4s -50%
shake128_init 2s 2s +0%
shake128x4_absorb_once 2s 6s -67%
shake256_absorb 2s 4s -50%
shake256_finalize 2s 2s +0%
shake256_release 2s 1s +100%
shake256_squeeze 2s 3s -33%
sk_s1hat_get_poly 2s 5s -60%
sk_s2hat_get_poly 2s 3s -33%
sys_check_capability 2s 2s +0%
unpack_sk_s2hat 2s 4s -50%
use_hint 2s 5s -60%
yvec_get_poly 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 1s 3s -67%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 3s -67%
keccakf1600_xor_bytes (big endian) 1s 3s -67%
mld_value_barrier_i64 1s 2s -50%
ntt_native_aarch64 1s 4s -75%
poly_chknorm_native 1s 3s -67%
poly_uniform_gamma1 1s 2s -50%
rej_eta_c 1s 4s -75%
shake128_release 1s 5s -80%
unpack_sk 1s 2s -50%
unpack_sk_s1hat 1s 3s -67%
yvec_init 1s 2s -50%

oqs-bot (Contributor) commented May 15, 2026

CBMC Results (ML-DSA-87)

Full Results (205 proofs)
Proof Status Current Previous Change
**TOTAL** 2363s 2349s +0.6%
polyvecl_pointwise_acc_montgomery_c 316s 296s +7%
mld_invntt_layer 282s 297s -5%
polyvec_matrix_expand 214s 213s +0%
rej_uniform_native 145s 149s -3%
mld_attempt_signature_generation 104s 105s -1%
poly_pointwise_montgomery_c 88s 99s -11%
mld_ct_memcmp 70s 64s +9%
sign_signature_internal 66s 64s +3%
sign_verify_internal 56s 57s -2%
polyvec_matrix_expand_serial 49s 48s +2%
mld_ntt_layer 41s 46s -11%
fqmul 40s 41s -2%
compute_pack_t0_t1 33s 32s +3%
polyvec_matrix_pointwise_montgomery_yvec 32s 32s +0%
keccakf1600x4_permute_native 23s 24s -4%
mld_ntt_butterfly_block 21s 19s +11%
rej_uniform 21s 21s +0%
poly_chknorm_c 19s 19s +0%
mld_check_pct 15s 16s -6%
polyeta_unpack 15s 16s -6%
polyt0_unpack 15s 16s -6%
poly_uniform_eta_4x 13s 12s +8%
rej_uniform_c 13s 12s +8%
poly_invntt_tomont_c 12s 7s +71%
poly_uniform_4x 12s 12s +0%
poly_add 10s 11s -9%
polyveck_decompose 10s 8s +25%
keccak_absorb_once_x4 9s 9s +0%
pointwise_acc_native_x86_64 9s 7s +29%
polyveck_invntt_tomont 9s 7s +29%
polyvecl_ntt 8s 10s -20%
mld_keccakf1600_permute_c 7s 6s +17%
poly_ntt_native 7s 7s +0%
sign_verify_pre_hash_shake256 7s 5s +40%
mld_compute_pack_z 6s 6s +0%
mld_sample_s1_s2 6s 5s +20%
pack_sk_s1 6s 4s +50%
pointwise_acc_native_aarch64 6s 8s -25%
poly_chknorm_native_x86_64 6s 2s +200%
poly_decompose_c 6s 7s -14%
poly_power2round 6s 4s +50%
polyveck_caddq 6s 7s -14%
polyveck_ntt 6s 7s -14%
power2round 6s 4s +50%
sign 6s 8s -25%
sign_keypair 6s 7s -14%
sign_pk_from_sk 6s 7s -14%
sign_verify_extmu 6s 4s +50%
sign_verify_pre_hash_internal 6s 5s +20%
unpack_sk_t0hat 6s 5s +20%
fqscale 5s 3s +67%
keccak_absorb 5s 8s -38%
keccak_squeezeblocks_x4 5s 2s +150%
keccakf1600_xor_bytes 5s 5s +0%
mld_ct_cmask_nonzero_u8 5s 2s +150%
mld_sample_s1_s2_serial 5s 6s -17%
ntt_native_aarch64 5s 2s +150%
poly_caddq_native 5s 4s +25%
poly_use_hint_c 5s 5s +0%
polyveck_unpack_eta 5s 2s +150%
polyvecl_pack_eta 5s 2s +150%
polyz_unpack_c 5s 3s +67%
rej_eta_native 5s 3s +67%
sign_keypair_internal 5s 2s +150%
sign_open 5s 4s +25%
sign_signature 5s 5s +0%
sign_signature_extmu 5s 5s +0%
decompose 4s 2s +100%
intt_native_aarch64 4s 2s +100%
mld_ct_abs_i32 4s 3s +33%
mld_ct_cmask_neg_i32 4s 2s +100%
mld_keccakf1600x4_extract_bytes_c 4s 3s +33%
mld_prepare_domain_separation_prefix 4s 4s +0%
ntt_native_x86_64 4s 4s +0%
nttunpack_native_x86_64 4s 4s +0%
poly_caddq_c 4s 7s -43%
poly_challenge 4s 4s +0%
poly_chknorm_native 4s 3s +33%
poly_chknorm_native_aarch64 4s 3s +33%
poly_decompose 4s 4s +0%
poly_decompose_32_native_aarch64 4s 3s +33%
poly_pointwise_montgomery 4s 1s +300%
poly_reduce 4s 2s +100%
poly_uniform_eta 4s 5s -20%
poly_uniform_gamma1_4x 4s 3s +33%
poly_use_hint_native 4s 2s +100%
polyt0_pack 4s 3s +33%
polyvec_matrix_pointwise_montgomery_row 4s 3s +33%
polyvecl_chknorm 4s 4s +0%
polyvecl_unpack_z 4s 1s +300%
polyz_unpack_17_native_aarch64 4s 2s +100%
polyz_unpack_19_native_aarch64 4s 5s -20%
rej_uniform_eta_native_aarch64 4s 2s +100%
rej_uniform_native_aarch64 4s 5s -20%
shake128x4_absorb_once 4s 5s -20%
shake256_absorb 4s 2s +100%
shake256_finalize 4s 4s +0%
shake256_release 4s 2s +100%
sign_signature_pre_hash_internal 4s 3s +33%
unpack_pk_t1 4s 1s +300%
unpack_sk 4s 2s +100%
unpack_sk_s1hat 4s 1s +300%
yvec_init 4s 6s -33%
keccak_f1600_x1_native_aarch64 3s 4s -25%
keccak_f1600_x1_native_aarch64_v84a 3s 3s +0%
keccak_f1600_x4_native_aarch64_v84a 3s 2s +50%
keccak_init 3s 3s +0%
keccakf1600x4_extract_bytes 3s 2s +50%
keccakf1600x4_permute 3s 3s +0%
keccakf1600x4_xor_bytes_native 3s 1s +200%
mld_ct_cmask_nonzero_u32 3s 3s +0%
mld_ct_get_optblocker_i64 3s 2s +50%
mld_ct_get_optblocker_u8 3s 4s -25%
mld_ct_sel_int32 3s 3s +0%
mld_h 3s 4s -25%
mld_keccakf1600x4_xor_bytes_c 3s 2s +50%
mld_polymat_expand_entry 3s 2s +50%
montgomery_reduce 3s 2s +50%
pack_sk_rho_key_tr_s2 3s 2s +50%
pointwise_native_aarch64 3s 4s -25%
pointwise_native_x86_64 3s 4s -25%
poly_caddq 3s 2s +50%
poly_caddq_native_aarch64 3s 5s -40%
poly_chknorm 3s 4s -25%
poly_decompose_88_native_aarch64 3s 3s +0%
poly_invntt_tomont 3s 2s +50%
poly_invntt_tomont_native 3s 2s +50%
poly_ntt 3s 1s +200%
poly_ntt_c 3s 3s +0%
poly_pointwise_montgomery_native 3s 2s +50%
poly_uniform_gamma1 3s 3s +0%
poly_use_hint 3s 3s +0%
poly_use_hint_native_aarch64 3s 2s +50%
polyt1_pack 3s 2s +50%
polyt1_unpack 3s 4s -25%
polyveck_chknorm 3s 5s -40%
polyveck_reduce 3s 3s +0%
polyvecl_uniform_gamma1 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 2s +50%
polyw1_pack_32 3s 4s -25%
polyw1_pack_88 3s 1s +200%
polyz_pack 3s 3s +0%
polyz_unpack_native_x86_64 3s 4s -25%
reduce32 3s 3s +0%
rej_eta 3s 3s +0%
rej_eta_c 3s 3s +0%
shake128_absorb 3s 3s +0%
shake128_squeeze 3s 4s -25%
shake128x4_squeezeblocks 3s 2s +50%
shake256 3s 2s +50%
shake256_init 3s 2s +50%
shake256x4_absorb_once 3s 2s +50%
sign_signature_pre_hash_shake256 3s 3s +0%
sk_s2hat_get_poly 3s 3s +0%
sk_t0hat_get_poly 3s 2s +50%
sys_check_capability 3s 3s +0%
unpack_sk_s2hat 3s 3s +0%
use_hint 3s 2s +50%
caddq 2s 3s -33%
intt_native_x86_64 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 1s +100%
keccak_f1600_x4_native_avx2 2s 1s +100%
keccak_finalize 2s 2s +0%
keccak_squeeze 2s 3s -33%
keccakf1600_extract_bytes (big endian) 2s 3s -33%
keccakf1600_permute 2s 3s -33%
keccakf1600_xor_bytes (big endian) 2s 5s -60%
keccakf1600x4_xor_bytes 2s 4s -50%
make_hint 2s 4s -50%
mld_ct_get_optblocker_u32 2s 2s +0%
mld_value_barrier_i64 2s 3s -33%
pack_sig_c 2s 3s -33%
pack_sig_z 2s 2s +0%
poly_caddq_native_x86_64 2s 4s -50%
poly_decompose_native 2s 2s +0%
poly_permute_bitrev_to_custom_optional_native 2s 2s +0%
poly_shiftl 2s 3s -33%
poly_sub 2s 3s -33%
poly_uniform 2s 4s -50%
polyeta_pack 2s 4s -50%
polyveck_pack_eta 2s 4s -50%
polyvecl_pointwise_acc_montgomery 2s 3s -33%
polyvecl_pointwise_acc_montgomery_native 2s 5s -60%
polyvecl_unpack_eta 2s 5s -60%
polyw1_pack 2s 6s -67%
polyz_unpack 2s 4s -50%
shake128_finalize 2s 3s -33%
shake128_init 2s 3s -33%
shake128_release 2s 3s -33%
shake256x4_squeezeblocks 2s 2s +0%
sig_unpack_hints 2s 6s -67%
sign_verify 2s 6s -67%
sk_s1hat_get_poly 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 1s +0%
keccakf1600_permute_native 1s 2s -50%
keccakf1600x4_extract_bytes_native 1s 4s -75%
mld_keccakf1600_extract_bytes 1s 2s -50%
mld_value_barrier_u32 1s 3s -67%
mld_value_barrier_u8 1s 2s -50%
pack_sig_h 1s 3s -67%
poly_permute_bitrev_to_custom_optional 1s 2s -50%
polyveck_pack_w1 1s 4s -75%
polyz_unpack_native 1s 2s -50%
shake256_squeeze 1s 1s +0%
yvec_get_poly 1s 3s -67%

@oqs-bot

oqs-bot commented May 15, 2026

Contributor

CBMC Results (ML-DSA-65)

Full Results (205 proofs)
Proof Status Current Previous Change
**TOTAL** 2143s 2036s +5.3%
mld_invntt_layer 311s 292s +7%
polyvecl_pointwise_acc_montgomery_c 214s 208s +3%
rej_uniform_native 155s 145s +7%
polyvec_matrix_expand 136s 130s +5%
poly_pointwise_montgomery_c 103s 89s +16%
mld_ct_memcmp 71s 66s +8%
mld_attempt_signature_generation 68s 66s +3%
sign_verify_internal 66s 65s +2%
sign_signature_internal 48s 48s +0%
mld_ntt_layer 46s 43s +7%
fqmul 42s 40s +5%
polyvec_matrix_expand_serial 25s 25s +0%
keccakf1600x4_permute_native 24s 22s +9%
mld_ntt_butterfly_block 23s 21s +10%
rej_uniform 21s 22s -5%
compute_pack_t0_t1 18s 17s +6%
poly_chknorm_c 18s 20s -10%
polyveck_decompose 17s 14s +21%
polyt0_unpack 16s 15s +7%
polyvecl_chknorm 16s 15s +7%
mld_check_pct 15s 14s +7%
poly_uniform_4x 14s 9s +56%
rej_uniform_c 14s 12s +17%
poly_uniform_eta_4x 11s 10s +10%
mld_compute_pack_z 10s 9s +11%
poly_add 10s 13s -23%
poly_invntt_tomont_c 10s 11s -9%
polyveck_chknorm 10s 9s +11%
sign 10s 6s +67%
keccak_absorb_once_x4 9s 12s -25%
mld_sample_s1_s2 8s 6s +33%
pointwise_acc_native_x86_64 8s 7s +14%
polyvec_matrix_pointwise_montgomery_yvec 8s 9s -11%
polyveck_invntt_tomont 8s 9s -11%
polyvecl_ntt 8s 6s +33%
keccak_absorb 7s 9s -22%
poly_uniform_gamma1_4x 7s 3s +133%
polyveck_caddq 7s 6s +17%
polyz_unpack_c 7s 6s +17%
rej_eta_native 7s 5s +40%
unpack_sk_t0hat 7s 5s +40%
keccak_squeezeblocks_x4 6s 5s +20%
mld_keccakf1600_permute_c 6s 8s -25%
pointwise_acc_native_aarch64 6s 4s +50%
poly_ntt_native 6s 1s +500%
poly_permute_bitrev_to_custom_optional_native 6s 2s +200%
polyveck_ntt 6s 7s -14%
intt_native_x86_64 5s 4s +25%
keccakf1600_permute_native 5s 1s +400%
keccakf1600x4_permute 5s 3s +67%
mld_polymat_expand_entry 5s 1s +400%
mld_sample_s1_s2_serial 5s 5s +0%
pack_sig_h 5s 5s +0%
poly_caddq_c 5s 4s +25%
poly_caddq_native 5s 4s +25%
poly_chknorm_native 5s 4s +25%
poly_chknorm_native_aarch64 5s 3s +67%
poly_power2round 5s 4s +25%
poly_reduce 5s 2s +150%
polyeta_unpack 5s 6s -17%
polyw1_pack_88 5s 2s +150%
sign_signature_pre_hash_internal 5s 3s +67%
sign_signature_pre_hash_shake256 5s 4s +25%
sign_verify_pre_hash_internal 5s 6s -17%
sk_s1hat_get_poly 5s 3s +67%
intt_native_aarch64 4s 2s +100%
keccak_f1600_x1_native_aarch64 4s 2s +100%
keccak_f1600_x1_native_aarch64_v84a 4s 1s +300%
keccakf1600_xor_bytes (big endian) 4s 2s +100%
mld_prepare_domain_separation_prefix 4s 2s +100%
ntt_native_aarch64 4s 2s +100%
poly_caddq_native_aarch64 4s 4s +0%
poly_caddq_native_x86_64 4s 2s +100%
poly_challenge 4s 2s +100%
poly_decompose_32_native_aarch64 4s 4s +0%
poly_pointwise_montgomery 4s 2s +100%
poly_uniform 4s 3s +33%
poly_uniform_gamma1 4s 3s +33%
poly_use_hint_native_aarch64 4s 5s -20%
polyveck_pack_w1 4s 2s +100%
polyveck_unpack_eta 4s 4s +0%
polyvecl_pack_eta 4s 2s +100%
polyvecl_pointwise_acc_montgomery 4s 4s +0%
polyvecl_pointwise_acc_montgomery_native 4s 4s +0%
polyvecl_uniform_gamma1 4s 4s +0%
polyvecl_unpack_eta 4s 1s +300%
polyvecl_unpack_z 4s 3s +33%
polyw1_pack_32 4s 1s +300%
polyz_pack 4s 5s -20%
shake128x4_squeezeblocks 4s 2s +100%
sign_keypair 4s 6s -33%
sign_keypair_internal 4s 4s +0%
sign_open 4s 5s -20%
sign_pk_from_sk 4s 4s +0%
sign_signature 4s 2s +100%
sign_signature_extmu 4s 3s +33%
sign_verify_extmu 4s 3s +33%
sign_verify_pre_hash_shake256 4s 5s -20%
unpack_sk 4s 5s -20%
use_hint 4s 3s +33%
decompose 3s 3s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 3s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 2s +50%
keccak_f1600_x4_native_avx2 3s 3s +0%
keccak_init 3s 1s +200%
keccakf1600_xor_bytes 3s 3s +0%
keccakf1600x4_extract_bytes 3s 3s +0%
keccakf1600x4_xor_bytes 3s 3s +0%
make_hint 3s 2s +50%
mld_ct_cmask_nonzero_u32 3s 2s +50%
mld_ct_get_optblocker_u32 3s 2s +50%
pack_sig_c 3s 5s -40%
pack_sig_z 3s 4s -25%
pack_sk_rho_key_tr_s2 3s 4s -25%
pack_sk_s1 3s 5s -40%
pointwise_native_aarch64 3s 2s +50%
pointwise_native_x86_64 3s 2s +50%
poly_chknorm 3s 3s +0%
poly_decompose 3s 2s +50%
poly_decompose_88_native_aarch64 3s 4s -25%
poly_decompose_native 3s 4s -25%
poly_invntt_tomont_native 3s 2s +50%
poly_ntt_c 3s 4s -25%
poly_permute_bitrev_to_custom_optional 3s 3s +0%
poly_pointwise_montgomery_native 3s 4s -25%
poly_sub 3s 5s -40%
poly_use_hint 3s 6s -50%
poly_use_hint_native 3s 4s -25%
polyt0_pack 3s 5s -40%
polyt1_unpack 3s 4s -25%
polyz_unpack_19_native_aarch64 3s 3s +0%
polyz_unpack_native 3s 3s +0%
power2round 3s 2s +50%
reduce32 3s 3s +0%
rej_eta 3s 2s +50%
shake128_release 3s 2s +50%
shake128x4_absorb_once 3s 4s -25%
shake256 3s 3s +0%
shake256_finalize 3s 3s +0%
shake256_release 3s 3s +0%
sig_unpack_hints 3s 3s +0%
sign_verify 3s 4s -25%
sk_s2hat_get_poly 3s 3s +0%
unpack_sk_s1hat 3s 3s +0%
unpack_sk_s2hat 3s 3s +0%
yvec_get_poly 3s 2s +50%
yvec_init 3s 3s +0%
caddq 2s 2s +0%
fqscale 2s 3s -33%
keccak_finalize 2s 3s -33%
keccak_squeeze 2s 1s +100%
keccakf1600_extract_bytes (big endian) 2s 1s +100%
keccakf1600x4_extract_bytes_native 2s 2s +0%
mld_ct_abs_i32 2s 3s -33%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_h 2s 2s +0%
mld_keccakf1600_extract_bytes 2s 2s +0%
mld_keccakf1600x4_extract_bytes_c 2s 2s +0%
mld_keccakf1600x4_xor_bytes_c 2s 3s -33%
mld_value_barrier_i64 2s 3s -33%
mld_value_barrier_u32 2s 1s +100%
mld_value_barrier_u8 2s 1s +100%
montgomery_reduce 2s 2s +0%
ntt_native_x86_64 2s 2s +0%
nttunpack_native_x86_64 2s 4s -50%
poly_caddq 2s 3s -33%
poly_chknorm_native_x86_64 2s 2s +0%
poly_decompose_c 2s 3s -33%
poly_invntt_tomont 2s 2s +0%
poly_ntt 2s 2s +0%
poly_shiftl 2s 4s -50%
poly_uniform_eta 2s 3s -33%
poly_use_hint_c 2s 3s -33%
polyeta_pack 2s 4s -50%
polyt1_pack 2s 4s -50%
polyvec_matrix_pointwise_montgomery_row 2s 2s +0%
polyveck_reduce 2s 3s -33%
polyvecl_uniform_gamma1_serial 2s 3s -33%
polyw1_pack 2s 5s -60%
polyz_unpack 2s 4s -50%
polyz_unpack_17_native_aarch64 2s 3s -33%
polyz_unpack_native_x86_64 2s 3s -33%
rej_eta_c 2s 2s +0%
rej_uniform_eta_native_aarch64 2s 4s -50%
rej_uniform_native_aarch64 2s 2s +0%
shake128_absorb 2s 3s -33%
shake128_finalize 2s 2s +0%
shake256_init 2s 3s -33%
shake256_squeeze 2s 3s -33%
shake256x4_absorb_once 2s 4s -50%
sk_t0hat_get_poly 2s 2s +0%
sys_check_capability 2s 3s -33%
keccak_f1600_x4_native_aarch64_v84a 1s 2s -50%
keccakf1600_permute 1s 3s -67%
keccakf1600x4_xor_bytes_native 1s 3s -67%
mld_ct_cmask_neg_i32 1s 2s -50%
mld_ct_cmask_nonzero_u8 1s 3s -67%
mld_ct_get_optblocker_i64 1s 2s -50%
mld_ct_sel_int32 1s 2s -50%
polyveck_pack_eta 1s 3s -67%
shake128_init 1s 4s -75%
shake128_squeeze 1s 1s +0%
shake256_absorb 1s 2s -50%
shake256x4_squeezeblocks 1s 1s +0%
unpack_pk_t1 1s 2s -50%

@hanno-becker hanno-becker force-pushed the rv32im_backend branch 2 times, most recently from 0df94c0 to 1e15593 Compare May 15, 2026 09:00
@hanno-becker hanno-becker force-pushed the rv32im_backend branch 3 times, most recently from 66858dd to 28b6232 Compare June 16, 2026 10:34
This commit adds an experimental arithmetic backend for RV32-IM.

The backend includes:
- A 2+2+2+2 forward NTT
- A 2+2+2+2 inverse NTT
- A poly-poly base multiplication (but not vector-vector)

Modular arithmetic in the NTT/inverse-NTT uses Barrett multiplication
as in the AArch64 backend. A notable difference is that RV32-IM does
not provide SQRDMULH, so we fall back to a plain MULH. This affects
the 'magic constant' slightly, and also increases the output bound
of the modular multiplication to 5/4 * MLDSA_Q -- this is already
detailed in proofs/isabelle/neon_ntt. The larger multiplication bound
is reflected in a larger output bound for the forward NTT, which has
already been adopted in a previous commit and is known not to cause
conflicts with the rest of the code. For the inverse NTT, however,
a bound > MLDSA_Q for the output is problematic. Hence, for the final
scaling step in the inverse NTT, we 'emulate' an SQRDMULH using
MULH + ADD + SRAI. This kernel still needs to be proved correct
in the 'Neon' NTT paper adaptation in proofs/isabelle/neon_ntt.
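
For illustration, here is a minimal C model of the two ideas in the paragraph above. It is a sketch under stated assumptions, not the assembly in this PR. It assumes the MULH 'magic constant' is round(b * 2^32 / MLDSA_Q) for a constant |b| < MLDSA_Q / 2, which gives outputs in (-MLDSA_Q/4, 5/4 * MLDSA_Q). The last helper shows one identity with the MULH + ADD + SRAI shape that reproduces the non-saturating SQRDMULH result when the constant satisfies |c| < 2^29; whether the kernel uses exactly this sequence is not stated here. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MLDSA_Q 8380417 /* 2^23 - 2^13 + 1 */

/* Floor division by a positive divisor. Models the arithmetic right shifts
 * done by MULH and SRAI without relying on >> of negative values in C. */
static int64_t floor_div(int64_t n, int64_t d)
{
    int64_t q = n / d;
    return (n % d < 0) ? q - 1 : q;
}

/* Model of RV32-M MULH: high 32 bits of the signed 64-bit product. */
static int32_t mulh(int32_t a, int32_t b)
{
    return (int32_t)floor_div((int64_t)a * b, (int64_t)1 << 32);
}

/* ASSUMED magic constant: round(b * 2^32 / MLDSA_Q), for |b| < MLDSA_Q / 2. */
static int32_t barrett_twist(int32_t b)
{
    int64_t num = (int64_t)b * ((int64_t)1 << 32);
    return (int32_t)floor_div(2 * num + MLDSA_Q, 2 * (int64_t)MLDSA_Q);
}

/* a * b - MLDSA_Q * mulh(a, b_twisted). The model uses 64-bit arithmetic for
 * portability; on RV32 the low 32 bits of each product (MUL) are enough,
 * because the result lies in (-MLDSA_Q/4, 5/4 * MLDSA_Q). */
static int32_t barrett_mul_mulh(int32_t a, int32_t b, int32_t b_twisted)
{
    int64_t r = (int64_t)a * b - (int64_t)MLDSA_Q * mulh(a, b_twisted);
    return (int32_t)r;
}

/* round(a * c / 2^31) for |c| < 2^29, written as MULH by 4*c, ADD 1, SRAI 1.
 * Uses floor((floor(y) + 1) / 2) == floor((y + 1) / 2) with y = a*c / 2^30. */
static int32_t rounding_mulh(int32_t a, int32_t c)
{
    int32_t t = mulh(a, 4 * c);                   /* floor(a * c / 2^30) */
    return (int32_t)floor_div((int64_t)t + 1, 2); /* (t + 1) >> 1        */
}

/* Small self-check of congruence and output bound for one input pair. */
static void barrett_self_check(void)
{
    int32_t a = -2000000000, b = 1234567;
    int32_t r = barrett_mul_mulh(a, b, barrett_twist(b));
    assert(((int64_t)a * b - r) % MLDSA_Q == 0);
    assert(r >= -(MLDSA_Q / 4) && r <= 5 * MLDSA_Q / 4);
}
```

The one-sided floor of MULH is what shifts the output range from the symmetric one obtained with a rounding multiply to (-MLDSA_Q/4, 5/4 * MLDSA_Q); adding the rounding step back, as in `rounding_mulh`, restores the symmetric behaviour at the cost of two extra instructions.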

We are interested in different classes of RV32-IM CPUs, some with fast,
some with slow multipliers. For CPUs with a very slow multiplier, such
as Hummingbird E203 (https://github.com/riscv-mcu/e203_hbirdv2), cycles
can be saved by leveraging MLDSA_Q = 2^23 - 2^13 + 1 and rewriting
the low-multiplication by MLDSA_Q as a sequence of SHIFT+ADD/SUB.
We offer this alternative version of the backend under a compile-time
flag MLD_USE_NATIVE_RV32IM_SLOW_MULTIPLIER.
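
The shift-based rewrite can be sketched in C as follows. This only models the arithmetic identity x * MLDSA_Q = (x << 23) - (x << 13) + x modulo 2^32; the function names and the `mul_q_low` macro are hypothetical, and the way the flag selects between the two variants here is an assumption for illustration, not the backend's actual build logic.

```c
#include <assert.h>
#include <stdint.h>

#define MLDSA_Q 8380417u /* 2^23 - 2^13 + 1 */

/* Reference: low 32 bits of x * MLDSA_Q, as a single MUL would produce. */
static uint32_t mul_q_low_mul(uint32_t x)
{
    return (uint32_t)(x * MLDSA_Q);
}

/* Same value from two shifts, one subtraction and one addition.
 * Unsigned arithmetic wraps modulo 2^32, matching the low product. */
static uint32_t mul_q_low_shift(uint32_t x)
{
    return (uint32_t)((x << 23) - (x << 13) + x);
}

/* Hypothetical compile-time selection keyed on the flag named above. */
#if defined(MLD_USE_NATIVE_RV32IM_SLOW_MULTIPLIER)
#define mul_q_low mul_q_low_shift
#else
#define mul_q_low mul_q_low_mul
#endif

/* Both variants must agree on every input; spot-check a few. */
static void mul_q_self_check(void)
{
    assert(mul_q_low_shift(0u) == mul_q_low_mul(0u));
    assert(mul_q_low_shift(1u) == MLDSA_Q);
    assert(mul_q_low_shift(0xFFFFFFFFu) == mul_q_low_mul(0xFFFFFFFFu));
    assert(mul_q_low(12345u) == mul_q_low_mul(12345u));
}
```

On a core where MUL takes many cycles, the four single-cycle ALU operations can be cheaper than one multiplication; on cores with a fast multiplier the single MUL remains the better choice, which is why the variant is opt-in.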

CI runs the functional, KAT, ACVP and wycheproof suites under
qemu-riscv32 for both the default and the slow-multiplier variant.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
