
Partcl/HRT Macro Placement Challenge


Win $20,000 by developing better macro placement algorithms!

Partcl and Hudson River Trading are excited to co-host a competition to solve the macro placement problem.

About Macro Placement

Macro placement is the problem of positioning large fixed-size blocks (SRAMs, IPs, analog macros, etc.) on a chip floorplan so that routing congestion, timing, power delivery, and area constraints are balanced. Unlike standard-cell placement, it must handle blocks with strong geometric and connectivity constraints, so the challenge is to explore a highly discrete design space while minimizing wirelength, avoiding blockages, and preserving downstream routability and timing quality.

For example, the ibm01 benchmark has:

  • 246 hard macros of varying sizes (ranging from 0.8 to 27 μm², with 33× size variation)
  • 7,269 nets connecting macros to each other and to 894 pre-placed standard cell clusters
  • A 22.9 × 23.0 μm canvas with 42.8% area utilization

[Animation: simulated annealing on ibm01]
[Animation: force-directed placement on ibm01]

About HRT Hardware

Hudson River Trading (HRT) is a leading quantitative trading firm at the forefront of technical innovation in global financial markets.

HRT’s Hardware team builds the high-performance compute systems at the core of our trading infrastructure. We use FPGAs and ASICs to drive low-latency decision-making and power custom solutions across the trading stack, from bespoke circuits to machine learning accelerators.

We’re proud to sponsor this competition because advancing macro placement and low-level hardware optimization directly aligns with the kinds of hard, performance-critical engineering challenges our teams tackle every day.

Joining Hudson River Trading’s hardware team means working alongside leading engineers in one of the most advanced computing environments in global finance. Learn more about open roles at hudsonrivertrading.com.

About Partcl

Partcl is rebuilding chip design infrastructure from the ground up for the GPU era.

Modern chip design is slow, fragmented, and fundamentally constrained by tools built decades ago. Critical workflows like timing analysis and placement still take hours to days, limiting how much engineers can explore and optimize.

We’re changing that.

Partcl develops GPU-accelerated systems for physical design that run orders of magnitude faster than legacy tools. Our goal is simple: make iteration cheap enough that design space exploration becomes the default, not the exception.

Background Papers

[1] An Updated Assessment of Reinforcement Learning for Macro Placement

[2] Assessment of Reinforcement Learning for Macro Placement

[3] Reevaluating Google's Reinforcement Learning for IC Macro Placement

[4] A graph placement methodology for fast chip design

🏆 Prizes

  • $20,000 — Grand Prize: The top 7 submissions by proxy score are evaluated through the OpenROAD flow on NG45 designs (including hidden designs). Among those 7, the submission that beats the SA and RePlAce baselines (reported in An Updated Assessment of Reinforcement Learning for Macro Placement) by the largest margin on WNS, TNS, and Area wins the Grand Prize.
  • $20,000 — First Place (Proxy): Awarded to the #1 submission by proxy score. Only awarded if no submission qualifies for the Grand Prize.
  • $5,000 — Second Place: Awarded to the runner-up of the Grand Prize. If no submission qualifies for the Grand Prize, awarded to the #2 submission by proxy score.
  • $4,000 — Innovation Award: Granted to the most creative or technically innovative approach among the top entries, as determined by the judging panel.
  • Swag: Every valid submission gets HRT swag!
  • Note: An additional score adjustment will be applied based on human-expert analysis of the resulting placement.

For full Grand Prize scoring rules, feasibility gate, tie-breaking, and ORFS-failure handling, see SCORING.md.

Submission Format

  • All submissions are made via Google Form. Submissions may be kept public or private until the end of judging.
  • Private submissions must share their repository with the judges so they can clone and evaluate the method.
  • Teams may include up to 5 individuals.
  • The deadline for submissions is May 21, 2026, at 11:59 PM Pacific Time.
  • Each team may submit only one algorithm.
  • All winning implementations must be open-sourced under Apache 2.0 or GPL.
  • All submissions must be registered via this Submission Link.
  • The macro placement algorithm must run end-to-end in under 1 hour per benchmark.
  • All submissions will be evaluated on an AMD EPYC 9655P (16 cores, 100 GB of memory) with an NVIDIA RTX 6000 Ada (48 GB).

Additional Rules

Allowed

  • Any algorithmic approach: SA, RL, GNN, analytical methods, hybrid approaches, learning-based, etc.
  • Any framework: PyTorch, TensorFlow, JAX, or pure Python/C++
  • Any optimization technique: Gradient descent, evolutionary algorithms, local search, etc.
  • Training on public benchmarks: You can learn from the IBM benchmark data
  • Hard-macro orientation flips (Klein-4 only: N, FN, FS, S) — carried to Tier 2 via an optional orientations.pt sidecar
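
A minimal sketch of how the four legal orientations act on a macro's pin geometry, plus one plausible way to write the orientations.pt sidecar. The helper, the pin-offset convention, and the dict schema are illustrative assumptions; the official sidecar format is whatever the organizers specify.

import torch

# The four legal orientations form the Klein four-group:
#   N: as designed                  FN: mirrored about the y-axis
#   FS: mirrored about the x-axis   S: rotated 180 degrees (FN then FS)
# Width and height are unchanged, so placement legality is unaffected.

def orient_pins(offsets: torch.Tensor, w: float, h: float, ori: str) -> torch.Tensor:
    """Map pin offsets (x, y measured from the macro's lower-left corner)."""
    x, y = offsets[:, 0], offsets[:, 1]
    if ori == "N":
        return torch.stack([x, y], dim=1)
    if ori == "FN":
        return torch.stack([w - x, y], dim=1)
    if ori == "FS":
        return torch.stack([x, h - y], dim=1)
    if ori == "S":
        return torch.stack([w - x, h - y], dim=1)
    raise ValueError(f"orientation {ori!r} is not allowed (no 90-degree rotations)")

# Hypothetical sidecar schema: macro name -> orientation, saved with torch.
torch.save({"sram_0": "FN", "sram_1": "N"}, "orientations.pt")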

Not Allowed

  • Modifying the evaluation functions (must use TILOS MacroPlacement evaluator as-is)
  • Hardcoding solutions for specific benchmarks (must be general algorithm)
  • Using external/proprietary placement tools (must be open-source submission)
  • Exceeding runtime limits (1 hour per benchmark hard timeout)
  • Overlaps in resulting placement (strictly zero overlap between hard macros — no tolerance. Participants should add small gaps in their legalization to avoid float-precision edge cases; see the self-check sketch after this list.)
  • 90° rotations of hard macros (R90, R270, FE, FW) — the fakeram45 SRAMs in our benchmarks aren't designed for rotation (pin access and internal metal direction assume a fixed orientation)
  • Resizing soft macros — soft-macro size is a proxy-only concept for density/congestion that doesn't translate to Tier 2; sizes are locked to the initial .plc values on every compute_proxy_cost call
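
As referenced in the overlap rule above, a minimal legality self-check; the O(n²) loop and the gap value are illustrative choices, not official tolerances.

def separated(a, b, gap=1e-4):
    """True if axis-aligned rectangles a and b, each (x, y, w, h),
    are at least `gap` apart along some axis."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax + aw + gap <= bx or bx + bw + gap <= ax or
            ay + ah + gap <= by or by + bh + gap <= ay)

def zero_overlap(macros, gap=1e-4):
    """Pairwise check over all hard macros; O(n^2) is fine for n <= 537."""
    return all(separated(macros[i], macros[j], gap)
               for i in range(len(macros))
               for j in range(i + 1, len(macros)))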

Evaluation Details

Evaluation is two-tiered:

Tier 1: Proxy Cost Ranking (All Submissions)

All submissions are ranked by proxy cost across the 17 IBM benchmarks. This is the primary qualifying metric. Proxy cost is computed using the TILOS MacroPlacement evaluator:

Proxy Cost = 1.0 × Wirelength + 0.5 × Density + 0.5 × Congestion
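
Stated as code, with the three components taken as already computed (normalized) by the TILOS evaluator; this helper just applies the contest weights.

# Contest weights applied to the evaluator's normalized components.
W_WIRELENGTH, W_DENSITY, W_CONGESTION = 1.0, 0.5, 0.5

def proxy_cost(wirelength: float, density: float, congestion: float) -> float:
    return (W_WIRELENGTH * wirelength
            + W_DENSITY * density
            + W_CONGESTION * congestion)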

Baseline numbers are from: An Updated Assessment of Reinforcement Learning for Macro Placement

Tier 2: OpenROAD Flow Validation (Top Submissions)

The top 7 submissions by proxy score will be evaluated through the full OpenROAD flow on NG45 designs to measure real PnR outcomes: WNS, TNS, and Area.

  • The Grand Prize ($20K) is awarded to the highest-scoring submission by the geometric mean of improvement ratios across WNS, TNS, and Area vs. the averaged SA/RePlAce baseline (sketched after this list).
  • To qualify, submissions must pass a feasibility gate: timing (WNS, TNS) must not be worse than both baselines on any design.
  • To avoid overfitting, we will also evaluate on 1-2 hidden NG45 designs.
  • Full scoring rules: SCORING.md
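
A sketch of the aggregation described above, assuming each metric is expressed as a positive magnitude where smaller is better (e.g. |WNS|, |TNS|, Area) and the improvement ratio is baseline over submission; the authoritative definition is in SCORING.md.

from math import prod

METRICS = ("wns", "tns", "area")

def tier2_score(sub: dict, base: dict) -> float:
    """Geometric mean of per-metric improvement ratios vs. the averaged
    SA/RePlAce baseline (assumed ratio convention; see SCORING.md)."""
    ratios = [base[m] / sub[m] for m in METRICS]
    return prod(ratios) ** (1 / len(ratios))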

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/partcleda/partcl-macro-place-challenge.git
cd partcl-macro-place-challenge

# Initialize TILOS MacroPlacement submodule (required for evaluation)
git submodule update --init external/MacroPlacement

# Install the package and all dependencies
uv sync

# Verify the setup
uv run evaluate submissions/examples/greedy_row_placer.py -b ibm01

Run Your First Example

# Run the greedy row placer on ibm01
uv run evaluate submissions/examples/greedy_row_placer.py

# Run on all 17 IBM benchmarks
uv run evaluate submissions/examples/greedy_row_placer.py --all

# Run on NG45 commercial designs (ariane133, ariane136, mempool_tile, nvdla)
uv run evaluate submissions/examples/greedy_row_placer.py --ng45

# Visualize the result
uv run evaluate submissions/examples/greedy_row_placer.py --vis
uv run evaluate submissions/examples/greedy_row_placer.py --all --vis

Running on all benchmarks produces a summary like:

Benchmark     Proxy        SA   RePlAce     vs SA  vs RePlAce  Overlaps
   ibm01    2.0463    1.3166    0.9976    -55.4%     -105.1%         0
   ibm02    2.0431    1.9072    1.8370     -7.1%      -11.2%         0
   ...
     AVG    2.2109    2.1251    1.4578     -4.0%      -51.7%         0

The greedy placer achieves zero overlaps but makes no attempt to optimize wirelength or connectivity — your job is to do better! See SETUP.md for the full API reference and submissions/examples/ for working examples.
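
Wirelength dominates the proxy cost, so half-perimeter wirelength (HPWL) is the first thing most placers optimize. Below is a self-contained sketch with an illustrative data model; the evaluator's actual representation is documented in SETUP.md.

def hpwl(nets, centers):
    """Half-perimeter wirelength: for each net, the half-perimeter of the
    bounding box of the components it connects. `nets` is a list of index
    lists and `centers` a list of (x, y) positions; both are illustrative,
    not the evaluator's data structures."""
    total = 0.0
    for net in nets:
        xs = [centers[i][0] for i in net]
        ys = [centers[i][1] for i in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Two components 3 um apart in x and 1 um apart in y give HPWL = 4.0.
print(hpwl([[0, 1]], [(0.0, 0.0), (3.0, 1.0)]))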

🎯 IBM Benchmark Suite (ICCAD04)

We evaluate on 17 designs from the ICCAD04 IBM benchmark suite (ibm05 is excluded because it contains no macros):

Benchmark Macros Nets Canvas (μm) Area Util. SA Baseline RePlAce Baseline
ibm01 246 7,269 22.9×23.0 42.8% 1.3166 0.9976
ibm02 254 7,538 23.2×23.5 43.1% 1.9072 1.8370
ibm03 269 8,045 24.1×24.3 44.2% 1.7401 1.3222
ibm04 285 8,654 24.8×25.1 44.8% 1.5037 1.3024
ibm06 318 9,745 26.1×26.5 46.1% 2.5057 1.6187
ibm07 335 10,328 26.8×27.2 46.8% 2.0229 1.4633
ibm08 352 10,901 27.5×27.9 47.4% 1.9239 1.4285
ibm09 369 11,463 28.1×28.5 48.0% 1.3875 1.1194
ibm10 387 12,018 28.8×29.2 48.6% 2.1108 1.5009
ibm11 405 12,568 29.4×29.8 49.2% 1.7111 1.1774
ibm12 423 13,111 30.1×30.5 49.8% 2.8261 1.7261
ibm13 441 13,647 30.7×31.1 50.4% 1.9141 1.3355
ibm14 460 14,178 31.4×31.8 51.0% 2.2750 1.5436
ibm15 479 14,704 32.0×32.4 51.6% 2.3000 1.5159
ibm16 498 15,225 32.7×33.1 52.2% 2.2337 1.4780
ibm17 517 15,741 33.3×33.7 52.8% 3.6726 1.6446
ibm18 537 16,253 34.0×34.4 53.4% 2.7755 1.7722

Each benchmark includes:

  • Hard macros (you place these)
  • Soft macros (you can also place these)
  • Nets connecting all components
  • Initial placement (hand-crafted, serves as reference)

Baseline Analysis:

  • RePlAce (⭐) consistently outperforms SA across all benchmarks
  • RePlAce achieves roughly 4-55% lower proxy cost than SA (smallest gap on ibm02, largest on ibm17)
  • To qualify for the Grand Prize, your placement must also produce better WNS, TNS, and Area than both baselines when evaluated through the OpenROAD flow on NG45 designs
  • Both baselines achieve zero overlaps (enforced as hard constraint)

💡 Why This Is Hard

Despite "only" 246-537 macros, this problem is extremely challenging:

  1. Massive search space: ~10^800 possible placements (even with constraints; see the back-of-envelope sketch after this list)
  2. Conflicting objectives: Wirelength wants clustering, density wants spreading, congestion wants routing space
  3. Non-convex landscape: Millions of local minima, discontinuities, plateaus
  4. Long-range dependencies: Moving one macro affects costs globally through thousands of nets
  5. Hard constraints: No overlaps between heterogeneous sizes (33× size variation)
  6. Tight packing: 43-53% area utilization leaves little slack
  7. Runtime matters: Must be fast enough to be practical (< 5 minutes ideal)
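
A back-of-envelope version of item 1, counting raw grid configurations before overlap and boundary constraints prune them (the 0.1 μm pitch is an illustrative assumption):

import math

# ibm01: 246 macros on a 22.9 x 23.0 um canvas discretized at 0.1 um.
# Each macro independently picks one of ~52,670 sites, so the raw count
# is positions**n_macros; constraints prune this toward the ~10^800
# figure quoted above.
n_macros, W, H, pitch = 246, 22.9, 23.0, 0.1
positions = (W / pitch) * (H / pitch)
log10_space = n_macros * math.log10(positions)
print(f"~10^{log10_space:.0f} raw configurations")  # ~10^1162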

Classical methods (SA, RePlAce) have been refined for decades but still have room for improvement!

📖 Documentation

  • Setup & API Reference: SETUP.md - Infrastructure details, benchmark format, cost computation, testing
  • Example Submissions: submissions/examples/ - Working placer examples

📚 References

  • TILOS MacroPlacement: GitHub Repository

    • Source of evaluation infrastructure
    • ICCAD04 benchmarks
    • SA and RePlAce baseline implementations
  • ICCAD04 Benchmarks: Classic macro placement benchmark suite used in academic research

🏅 Leaderboard

Submissions are ranked by average proxy cost across all 17 IBM benchmarks (lower is better). Zero overlaps required on all benchmarks. Scores are unverified until confirmed by judges.

Rank Team Avg Proxy Cost Best Worst Overlaps Runtime Verified Notes
1 "vmallela" 1.0109 0.7644 1.2921 0 15.5h total Verified 1.0109 (self-reported 1.1)
2 "Cezar" 1.037 0 55min/bench Resubmitted 5/3. Verification blocked on missing steps/analytical.py import. Previous variant verified 1.2224.
3 "KLA MACH" 1.2121 0.8527 1.6532 0 2h15min total Verified 1.2121 (self-reported 1.2355). Consolidates UTDA / Chuanqi Chen / KLA MACH submissions (one algorithm per team).
4 "Hoop Dreams" 1.2206 0 20min/bench New 5/1. Verification blocked on Python 3.12 / 3.11 ABI mismatch.
5 "Shoom" 1.2353 0 42min/bench Resubmitted 5/1 (was 1.3381).
6 "ArzunPD" 1.2478 0 55min/bench Resubmitted 5/1 (was HyperPlace, verified 1.4421).
7 "RoRa" 1.2788 0.9577 1.6222 0 2.6h total Verified 1.2788 (self-reported 1.2723). Resubmitted 5/1.
8 "William Zhang" 1.2767 0 259s/bench Resubmitted 5/2 (was "Convex Optimization", verified 1.4556). Blocked on missing casadi module.
9 "MTK" 1.2818 0.9073 1.6529 0 37s/bench (GPU) Verified 1.2818 (self-reported 1.317).
10 "Electric Beatle" 1.3253 0 2000s/bench (GPU) Resubmitted 4/30 (was verified 1.3913).
11 "UToronto Analytical" 1.3323 0.9371 1.6545 0 24min total Verified 1.3323 (self-reported 1.3325).
12 "V5" 1.3382 0 850s/bench New 4/23.
13 "Archgen" 1.3479 0 2404s total New 4/24.
14 "Varun's Parallel Worlds" 1.4017 1.0362 1.7298 0 27s/bench
15 "UT Austin - AS" 1.4076 0 17s/bench
16 "ByteDancer" 1.4151 1.0236 1.7792 0 38min/bench
17 "TAISPlAce" 1.4321 0 28min/bench
18 "Two-IIITK-Kids" 1.436 0 38min/bench New 5/2, resubmitted 5/4.
19 "Pragnay" 1.4427 0 632s/bench Blocked on compute_proxy_cost(..., plc=None) in fallback path.
20 "No Man's Sky" 1.4445 0 8.8min/bench New 5/4. Repo not accessible.
21 "another Waterloo kid" 1.4568 0 118s/bench Blocked on Modal cloud dispatch — can't run air-gapped.
RePlAce (baseline) 1.4578 0.9976 1.8370 0
22 "W3 Solutions" 1.4824 0 90s/bench Runtime exceeds 1h/bench cap.
23 "Jiangban Ya" 1.4943 1.0891 1.8099 0 49s/bench
24 "UTAUSTIN-CT" 1.5062 1.1363 1.7941 0 6s/bench
25 "oracleX" 1.5130 1.1340 1.7937 0 11s/bench
26 "SEVmakers" 1.5200 0 200s/bench Private repo — pending judge access.
27 "CA" 1.5247 1.2226 1.7945 0 2s/bench Verified 1.5247 (self-reported 1.5238).
28 "#5 ubc cpen student" 1.5337 1.1411 1.8084 0 13s/bench
29 Will Seed (Partcl) 1.5338 1.1625 1.7965 0 35s total
30 "RUDY Can't Fail" 1.5397 1.1927 1.8881 0 6min total Verified 1.5397 (self-reported 1.3605).
31 "UT Austin - RH" 1.6037 0 4.5s/bench
32 "UT Austin - CT" 1.8706 0 187s/bench
33 "AS" 1.9121 1.4614 2.3508 0 0.16s total
34 "Adi's Team" 2.0025 0 3726s/bench Blocked on compute_proxy_cost(skip_congestion=True) kwarg.
35 "Sharc #1" 2.0433 1.5143 2.4336 0 96s/bench
SA (baseline) 2.1251 1.3166 3.6726 0
Greedy Row (demo) 2.2109 1.6728 2.7696 0 0.3s total
"Binghamton" pending
"MacroBio" pending
DQ "Mike Gao" self-reported 1.3255 1939 16min/bench 1939 overlaps across 17 benchmarks.
DQ "BakaBobo" self-reported 1.4044 282s/bench Missing import — code won't run.

Submit your results via the Submission Link!

🤔 FAQ

Q: What benchmarks are used? A: Tier 1 (proxy cost) uses 17 IBM ICCAD04 benchmarks — the standard academic suite with well-established baselines. Tier 2 (OpenROAD flow) uses NG45 commercial designs (ariane133, ariane136, mempool_tile, nvdla) plus 1-2 hidden designs. You can evaluate on both with --all (IBM) and --ng45 (NG45).

Q: What if I beat one baseline but not the other? A: You must beat BOTH SA and RePlAce baselines on WNS, TNS, and Area to qualify for the Grand Prize. You can still win the Proxy or Innovation prizes regardless.

Q: Are there hidden test cases? A: All 17 IBM benchmarks for proxy cost ranking are public. The 4 NG45 designs are also public. For the OpenROAD flow evaluation (Tier 2), we will additionally test on 1-2 hidden NG45 designs to ensure generalization.

Q: What counts as "beating" the baseline? A: For proxy cost (Tier 1), your aggregate score across all IBM benchmarks must be lower than the baselines. For the Grand Prize (Tier 2), your OpenROAD results for WNS, TNS, and Area must surpass both SA and RePlAce baselines on NG45 designs.

📧 Contact

📄 License

This project is licensed under the Apache License 2.0; see LICENSE.md for details.

Competition Updates

The organizers may update or clarify rules, evaluation details, timelines, prizes, or infrastructure as needed to ensure fairness, technical accuracy, and smooth operation of the competition. Any updates will be communicated through official channels and will apply going forward.

Participation in the competition constitutes acceptance of the current rules and any subsequent updates. The organizers’ decisions regarding scoring, eligibility, and interpretation of these rules are final.

Submissions & contact information may be shared with sponsors.
