[Docs] Add user-guide page for atomics and bit operations by hughperkins · Pull Request #640 · Genesis-Embodied-AI/quadrants

hughperkins · 2026-05-07T07:41:55Z

Summary

New user-guide page docs/source/user_guide/atomics.md covering the per-thread / scalar tier of Quadrants primitives — atomic read-modify-write ops (qd.atomic_*) and bit-counting helpers (qd.math.popcnt, qd.math.clz).

Companion to the in-flight subgroup (#639) and block (#638) docs; this is one of a series of tier-by-tier user-guide pages. Independent of the others — no cross-links to unmerged docs.

Includes:

- Backend / dtype support matrices for all 8 atomics and the two bit-count helpers, with caveats (f64 fast path, clz u32 / u64 + SPV 32-bit hard-cap).
- Per-op semantics with the universal "returns the old value" / "per-location atomicity, not a fence on the rest of memory" framing.
- Worked examples: slot reservation (workhorse pattern for select / compact), histogram, popcnt sum, MSB-via-clz.
- Performance and portability notes: atomic contention, fence-pairing, f64 and atomic_mul cost, clz portability.
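
For readers skimming this description, the slot-reservation pattern named above can be modelled in plain Python. This is a minimal sketch, not Quadrants code: `AtomicCounter`, `fetch_add`, and `compact` are hypothetical names, and a lock stands in for hardware atomicity. It only shows why "returns the old value" is what makes the pattern work: each thread gets a unique index from the counter.

```python
import threading


class AtomicCounter:
    """Toy stand-in for an atomic integer cell (hypothetical, not a Quadrants type)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_add(self, delta):
        # The read-modify-write happens as one indivisible step,
        # and the caller receives the value from BEFORE the add.
        with self._lock:
            old = self._value
            self._value = old + delta
            return old


def compact(values, keep):
    """Copy the values that pass `keep` into a dense list; order is not preserved."""
    out = [None] * len(values)
    cursor = AtomicCounter(0)

    def worker(v):
        if keep(v):
            slot = cursor.fetch_add(1)  # the old value is this thread's private slot
            out[slot] = v

    threads = [threading.Thread(target=worker, args=(v,)) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    count = cursor.fetch_add(0)  # adding 0 just reads the final count
    return out[:count]


result = compact(list(range(20)), lambda v: v % 3 == 0)
```

The output order depends on thread scheduling, which mirrors the usual caveat that compaction by atomic slot reservation is unordered.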

Adds atomics to the SIMT-primitives toctree in index.md.

Test plan

- `cd docs && make html` builds without warnings (no broken cross-links; all references to unmerged docs are plain prose, not Markdown links).
- Visually check the rendered page on the RTD preview.

Made with Cursor

Documents qd.atomic_* (add, sub, mul, min, max, and, or, xor) and the bit-counting helpers qd.math.popcnt / qd.math.clz. Covers semantics (returns old value, per-location atomicity, no fence on the rest of memory), backend / dtype support, performance pitfalls (contention, f64 fast path, atomic_mul CAS loop), and worked examples (slot reservation, histogram, popcnt sum, MSB). Adds atomics.md to the SIMT-primitives toctree in index.md.
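
The "atomic_mul CAS loop" pitfall mentioned above refers to the general technique of building a read-modify-write op out of compare-and-swap when no native instruction exists. The following is a pure-Python sketch of that technique only; `Cell`, `compare_exchange`, and this `atomic_mul` are hypothetical names, a lock emulates the hardware CAS, and how each Quadrants backend actually lowers the op is not stated in this PR.

```python
import threading


class Cell:
    """Toy memory location with an emulated compare-and-swap (hypothetical)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_exchange(self, expected, desired):
        # Store `desired` only if the cell still holds `expected`.
        # Always return the value that was observed.
        with self._lock:
            observed = self._value
            if observed == expected:
                self._value = desired
            return observed


def atomic_mul(cell, factor):
    """Multiply the cell by `factor` atomically and return the old value."""
    old = cell.load()
    while True:
        observed = cell.compare_exchange(old, old * factor)
        if observed == old:
            return old  # our update won
        old = observed  # another thread got in first; retry with the fresh value


cell = Cell(1)
threads = [threading.Thread(target=atomic_mul, args=(cell, 2)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_value = cell.load()
```

Every failed `compare_exchange` costs another round trip, so the loop gets more expensive as contention on a single location rises.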

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 96eaa1904b


chatgpt-codex-connector · 2026-05-07T07:45:11Z

+
+### `qd.math.popcnt(x)`
+
+Counts set bits in `x` and returns an `i32`. Lowers to `__popc` / `__popcll` on CUDA, `OpBitCount` on SPIR-V, `__builtin_amdgcn_popcnt` on AMDGPU. Defined for all integer dtypes.


Stop advertising AMDGPU popcnt support

When users run on qd.amdgpu, qd.math.popcnt is currently not lowered: tests/python/test_unary_ops.py::test_popcnt explicitly xfails AMDGPU with “codegen does not lower this op,” and quadrants/codegen/amdgpu/codegen_amdgpu.cpp::emit_extra_unary has no UnaryOpType::popcnt branch. This line (and the later “fully cross-backend” note) will send AMDGPU users to an API path that hits QD_NOT_IMPLEMENTED, so the docs should mark AMDGPU as unsupported until that lowering exists.
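
When validating results on a backend that does lower the op, a host-side reference for the documented semantics ("counts set bits in `x`") can be handy. The sketch below is plain Python, not Quadrants code; the function name `popcnt_ref` and the `bits` parameter are illustrative. Treating signed inputs as their two's-complement bit pattern is an assumption, since the quoted text does not say how negative values are handled.

```python
def popcnt_ref(x, bits=32):
    """Count set bits in the low `bits` bits of x (two's-complement view of negatives)."""
    mask = (1 << bits) - 1
    return bin(x & mask).count("1")


examples = {
    "zero": popcnt_ref(0),
    "0b1011": popcnt_ref(0b1011),
    "all ones, 32-bit": popcnt_ref(0xFFFFFFFF),
    "-1 as i32": popcnt_ref(-1),
    "-1 as i64": popcnt_ref(-1, bits=64),
}
```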


chatgpt-codex-connector · 2026-05-07T07:45:11Z

+
+### `qd.math.clz(x)`
+
+Counts leading zero bits in `x` and returns an `i32`. For a 32-bit input, `clz(0) = 32`; otherwise the result is in `[0, 31]`. Lowers to `__nv_clz` / `__nv_clzll` on CUDA, `FindMSB` on SPIR-V (with `bitwidth - 1 - FindMSB` to convert MSB index into leading-zero count), `__builtin_amdgcn_sffbh_i32` on AMDGPU. See the cross-backend caveats in the support table.


Stop advertising AMDGPU clz support

For the same AMDGPU context, qd.math.clz is not implemented today: tests/python/test_unary_ops.py::test_clz xfails AMDGPU because the codegen does not lower it, and the AMDGPU unary lowering override does not handle UnaryOpType::clz. Documenting an AMDGPU builtin here makes the new user guide claim support for kernels that currently fail at codegen time.
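
The `bitwidth - 1 - FindMSB` conversion quoted in the diff can be checked against the stated `clz(0) = 32` edge case with a small host-side model. This is a plain-Python sketch, not Quadrants code: `find_umsb` and `clz_ref` are hypothetical names, and the model assumes the unsigned find-MSB convention in which a zero input yields -1.

```python
def find_umsb(x, bits=32):
    """Index of the most significant set bit, or -1 when x is 0 (assumed convention)."""
    x &= (1 << bits) - 1
    return x.bit_length() - 1


def clz_ref(x, bits=32):
    """Leading-zero count derived from the MSB index, as in the quoted conversion."""
    return bits - 1 - find_umsb(x, bits)


def msb_index(x, bits=32):
    """MSB-via-clz: recover the highest set bit position from the leading-zero count."""
    return bits - 1 - clz_ref(x, bits)


checks = [clz_ref(0), clz_ref(1), clz_ref(0x80000000), clz_ref(0xFFFF)]
```

Under that convention the zero case falls out without a special branch: `32 - 1 - (-1)` gives 32, matching the documented value.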


github-actions · 2026-05-07T08:59:50Z

Diff coverage: 0% · 0 lines, 0 missing

The previous single page covered two unrelated topics: the qd.atomic_* read-modify-write ops, and the qd.math.popcnt / qd.math.clz bit-counting helpers. They share no concerns (memory coordination vs single-thread register ops) and live in different namespaces. Split into:

- atomics.md -- title shortened to "Atomics". Drops the bit-counting table, the popcnt / clz semantics subsections, the bitset-popcount and Morton-MSB examples, and the bit-trick portability bullet. Adds a one-line cross-reference to math.md in the intro and the Related section.
- math.md (new) -- framed as a qd.math.* page that currently documents only the bit operations. Lists the broader qd.math surface (vec / mat / GLSL ops / trig / rotations / float-classification / constants) in the intro per the project's own python/quadrants/math/__init__.py docstring, and notes explicitly that those are exported but not yet documented here. Reuses the bit-counting table, semantics, examples, and portability notes verbatim from the previous atomics.md.

Toctree gains a "math" entry next to "atomics" under the SIMT-primitives caption, preserving the existing placement of atomics rather than re-categorising as part of this split.

This was referenced May 7, 2026

[Docs] Add user-guide page for qd.simt.grid.* primitives #641

Draft

[Docs] Add user-guide page for qd.algorithms.* device-wide algorithms #642

Draft

chatgpt-codex-connector Bot reviewed May 7, 2026

hughperkins mentioned this pull request May 7, 2026

[Docs] Add user-guide page for matrix decompositions and solvers #643

Draft

3 tasks

hughperkins marked this pull request as draft May 7, 2026 08:49

hughperkins wants to merge 2 commits into main from hp/docs-atomics


