Skip to content

emomaxd/llvm-pass-experiments

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

43 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

llvm-pass-experiments

Personal reference for LLVM pass writing and compiler analysis algorithms.

Not a tutorial. Working implementations of algorithms I wanted to understand by writing from scratch — dominator trees, dataflow, SSA, GPU divergence analysis, etc. Built against a custom LLVM 19+ build. New pass manager throughout.

Build

cmake -B build -DLLVM_DIR=/path/to/llvm/lib/cmake/llvm
cmake --build build -j$(nproc)

Each pass produces its own .so under build/passes/<name>/.

Run

opt -load-pass-plugin=build/passes/<dir>/<Target>.so \
    -passes="function(<pass-name>)" input.ll -disable-output

The exact command is in each file's header comment.


Passes

Analysis

pass what
domtree Dominator tree — Cooper et al. iterative algorithm. Block A dominates B if every path from entry to B passes through A.
liveness Backward dataflow. Which values are live at each point? Phi-edge aware: phi operands attributed to predecessor block, not phi's own block.

Optimizations

pass what
dce-simple Dead code elimination — mark from critical instructions backwards through def-use, sweep unmarked.
cse-simple Common subexpression elimination — domtree DFS + scoped value numbering. Same opcode + same operands = same result, reuse it.
copy-prop Trivial phi collapse, identity bitcasts, no-op arithmetic (x+0, x*1, …). Iterates to fixpoint.
loop-detect Natural loop detection via back edges. A back edge is N→H where H dominates N. Loop body = backward BFS from N stopping at H.

GPU / SIMT

Written with AMD GCN/RDNA in mind; concepts apply to any SIMT architecture.

pass what
wave-div Divergence analysis. Thread ID is the root source. Data divergence (divergent operand → result) + control divergence (divergent branch → phi at reconvergence).
vgpr-pressure Liveness of divergent values → peak VGPR count → GCN occupancy. >64 VGPRs = ≤4 waves = bad latency hiding.
mem-coalesce Classifies loads/stores: COALESCED (adjacent lanes, adjacent addresses), STRIDED (skips elements), BROADCAST (uniform address), RANDOM (scatter/gather).

Misc

pass what
logop Instruments every binary op with a call to logop(result). Original skeleton example, ported to new PM.

experiments/

Scratch files, not built.


References

  • Cooper, Harvey, Kennedy — "A Simple, Fast Dominance Algorithm" (SPE 2001)
  • Sampaio et al. — "Divergence analysis" (TOPLAS 2013)