Leaderboard submission - Ashvin Verma#59

Open
ashvin-verma wants to merge 45 commits into partcleda:main from ashvin-verma:main

@ashvin-verma

Adds Ashvin Verma to both the updated and old leaderboard tables.

New leaderboard result:

  • Avg overlap: 0.0000
  • Avg normalized WL: 0.3818
  • Runtime: 825.14s

Old leaderboard comparable result:

  • Avg overlap: 0.0000
  • Avg normalized WL: 0.3326
  • Runtime: 699.03s

Note: Selective projected GD + WL-aware legalizer

Best result CSV: ashvin/results/20260421_124936_selective_projected_shelf_v2_full_suite.csv

ashvin-verma and others added 30 commits March 21, 2026 14:48
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pipeline: annealed softplus overlap loss → density penalty → deterministic
legalization → greedy repair. Scalable spatial hash handles 100K+ cells.

Key components in ashvin/:
- solver.py: single-stage annealed solver (softplus beta 0.1→6.0, lambda ramp)
- overlap.py: two-tier spatial hash (macros exhaustive, std cells binned)
- density.py: bilinear density penalty
- legalize.py: row-based packing with macro repair
- repair.py: greedy nudge with brute-force fallback
- config.py: preset configs for optuna tuning
- run_tests.py: instrumented runner with CSV output
- view.py: versioned placement visualizations

Results: 0.0000 overlap on tests 1-10 (verified on alternate seeds).
WL ~0.51 (leaderboard #1 is 0.13; Phase 2 target).
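
To illustrate the annealed softplus overlap loss described in this commit, here is a minimal pure-Python sketch. It is not the code in ashvin/solver.py or ashvin/overlap.py: the function names, the center-plus-size cell representation, the all-pairs loop (the commit uses a spatial hash) and the geometric shape of the beta ramp are all assumptions. Only the 0.1→6.0 beta range comes from the commit message.

```python
import math

def softplus(x, beta):
    """Smooth approximation of max(x, 0); it gets sharper as beta grows."""
    z = beta * x
    # Numerically stable form: max(z, 0) + log(1 + exp(-|z|)), rescaled by beta.
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta

def overlap_loss(cells, beta):
    """Sum of smoothed overlap areas over all pairs (illustrative O(N^2) loop).

    cells: list of (x_center, y_center, width, height).
    """
    total = 0.0
    for i in range(len(cells)):
        xi, yi, wi, hi = cells[i]
        for j in range(i + 1, len(cells)):
            xj, yj, wj, hj = cells[j]
            # Signed penetration depth per axis (positive means the cells overlap).
            dx = (wi + wj) / 2 - abs(xi - xj)
            dy = (hi + hj) / 2 - abs(yi - yj)
            total += softplus(dx, beta) * softplus(dy, beta)
    return total

def beta_schedule(epoch, n_epochs, beta_start=0.1, beta_end=6.0):
    """Geometric ramp from beta_start to beta_end (the ramp shape is an assumption)."""
    t = epoch / max(n_epochs - 1, 1)
    return beta_start * (beta_end / beta_start) ** t
```

At low beta the loss stays positive even for separated cells, so every pair contributes a gradient. At high beta it approaches the hard overlap area, which is the usual motivation for annealing from soft to sharp.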

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Key fixes:
- Macro-macro repair pass in legalization (resolves stale position bug)
- Obstacle re-checking loop in row packing (catches cascading shifts)
- Brute-force repair fallback for small N (catches bin-boundary misses)
- Iterative legalize-repair cycles (converges for 100K cells)
- Adaptive epoch scaling (200 epochs for N>10K, 500 for N>2K)

Test 12 (100K cells): 0.0000 overlap in 721s — previously 0.1491.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added ashvin/wl_optimize.py with gradient WL polish: runs GD on wirelength
only (macros frozen), then re-legalizes to maintain zero overlap. 3 cycles
with decreasing LR. WL improved 0.5132 → 0.4971 on tests 1-10.

Bottleneck identified: row-based legalization adds ~0.05 WL penalty per pass.
GD achieves 0.40 WL but legalization snaps it to 0.45+. Need minimal-disturbance
legalization or cell swap post-processing for competitive WL (~0.13).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cell swap: swap same-height nearby cells if WL improves without overlap.
Gradient polish: GD on WL only → re-legalize, 3 cycles.

WL improved: 0.5132 → 0.4912 on tests 1-10. Still 0.0000 overlap.
Test 10 swap phase is slow (509s) due to O(N) overlap check per swap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ashvin/tune.py: optuna hyperparameter search (30 trials in 3 min)
- ashvin/wl_optimize.py: cell swap + gradient WL polish
- ashvin/solver.py: LR schedule support (warmup, warmup_cosine, constant)

Optuna best config: WL 0.4544 (from 0.4971), 0.0000 overlap on all tests.
Key finding: softer beta (2.09), lower LR (0.003), fewer epochs (500) wins.
Config saved to ashvin/results/best_config.json.
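
For readers unfamiliar with the schedule names, the sketch below shows one common way to define `constant`, `warmup` and `warmup_cosine`. It is not taken from ashvin/solver.py: the function name `lr_at`, the linear warmup, and the 10% warmup fraction are assumptions. The defaults of 0.003 and 500 steps echo the Optuna finding above.

```python
import math

def lr_at(step, total_steps, base_lr=0.003, schedule="warmup_cosine", warmup_frac=0.1):
    """Learning rate at a given step (hypothetical helper, not the repo's API)."""
    if schedule not in ("constant", "warmup", "warmup_cosine"):
        raise ValueError(f"unknown schedule: {schedule}")
    if schedule == "constant":
        return base_lr
    # Linear ramp from base_lr / warmup_steps up to base_lr.
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    if schedule == "warmup":
        return base_lr
    # Cosine decay from base_lr toward 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Example: print the first, peak and last learning rates of a 500-step run.
for s in (0, 50, 499):
    print(s, lr_at(s, 500))
```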

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ashvin/legalize.py: legalize_min_disturbance() nudges minimum distance
  (experimental — row-based still default, more reliable)
- ashvin/view.py: edge visualization shows long edges in red, WL in title
- ashvin/run_tests.py: --no-wl-polish flag for faster benchmarking
- ashvin/tune.py: lr_schedule as tunable param
- ashvin/solver.py: lr_schedule support (warmup, warmup_cosine, constant)

Min-disturbance was worse on WL (0.57 vs 0.52) due to cascading nudges.
Reverted to row-based as default. WL gap (0.45 vs 0.13) is fundamentally
about GD quality, not legalization approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ashvin/init_placement.py: spectral (eigenvector) initial placement
  via graph Laplacian — places connected cells near each other
- ashvin/solver.py: solve_multistart() tries random + spectral init,
  picks best WL result with 0 overlap
- ashvin/wl_optimize.py: cell_swap uses spatial hash for O(1) overlap
  check instead of O(N) — 500x faster on test 10

Multi-start WL: 0.4468 (from 0.4544). Some tests see big gains from
spectral init (test 3: 0.63→0.33, test 9: 0.50→0.37), others prefer
random. Multi-start picks the best per test.
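
The spatial-hash overlap check mentioned in this commit can be sketched as follows. This is an illustrative stdlib-only version, not the code in ashvin/overlap.py or ashvin/wl_optimize.py. The class name, method names and center-plus-size rectangle format are assumptions. The check is O(1) only in the expected sense, when each bin holds a bounded number of cells. The repo's two-tier scheme checks macros exhaustively, and this sketch does not model that.

```python
import math
from collections import defaultdict

class SpatialHash:
    """Uniform grid over rectangles given as (x_center, y_center, width, height)."""

    def __init__(self, bin_size):
        self.bin_size = bin_size
        self.bins = defaultdict(set)  # (bx, by) -> ids of cells touching that bin
        self.rects = {}               # id -> rectangle

    def _bins(self, rect):
        # Yield every grid bin the rectangle touches.
        x, y, w, h = rect
        bx0 = math.floor((x - w / 2) / self.bin_size)
        bx1 = math.floor((x + w / 2) / self.bin_size)
        by0 = math.floor((y - h / 2) / self.bin_size)
        by1 = math.floor((y + h / 2) / self.bin_size)
        for bx in range(bx0, bx1 + 1):
            for by in range(by0, by1 + 1):
                yield (bx, by)

    def insert(self, cid, rect):
        self.rects[cid] = rect
        for key in self._bins(rect):
            self.bins[key].add(cid)

    def remove(self, cid):
        rect = self.rects.pop(cid)
        for key in self._bins(rect):
            self.bins[key].discard(cid)

    def overlaps(self, rect, ignore=()):
        """True if rect overlaps any stored rectangle whose id is not in ignore."""
        x, y, w, h = rect
        for key in self._bins(rect):
            for other in self.bins.get(key, ()):
                if other in ignore:
                    continue
                ox, oy, ow, oh = self.rects[other]
                # Strict inequalities: rectangles that merely touch do not overlap.
                if abs(x - ox) < (w + ow) / 2 and abs(y - oy) < (h + oh) / 2:
                    return True
        return False
```

A swap of cells a and b would then be tested by calling `overlaps` for each cell at the other's position with `ignore={a, b}`.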

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
100 trials, best trial 97. Key: higher lambda_wl (3.58), warmup_cosine LR,
low overlap start (1.23), soft beta (2.03). WL improved 0.45 → 0.41.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Barycentric refinement: move cells toward centroid of neighbors,
  accept if no overlap. ~2% WL improvement, fast, always on.
- Scatter solver: explosive scatter + reconverge. Tested scatter
  factors 1.0-2.0 — doesn't help (disrupted solutions don't find
  better minima).
- Disabled slow cell swap + GD polish by default.

Best WL: ~0.45 with optuna config + barycentric. Rank ~10.
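
A single pass of the barycentric refinement described above might look like the sketch below. It is a hedged illustration, not the repo's implementation. The data layout, the `step` parameter and the brute-force overlap test are assumptions made to keep the example short. The move rule (step toward the neighbors' centroid, keep the move only if it creates no overlap) follows the commit message.

```python
def rects_overlap(a, b):
    """Axis-aligned overlap test for (x_center, y_center, width, height) rectangles."""
    return abs(a[0] - b[0]) < (a[2] + b[2]) / 2 and abs(a[1] - b[1]) < (a[3] + b[3]) / 2

def barycentric_pass(cells, neighbors, step=1.0):
    """Move each listed cell toward the centroid of its neighbors.

    cells: dict id -> [x, y, w, h], updated in place.
    neighbors: dict id -> list of connected cell ids.
    Returns the number of accepted moves.
    """
    accepted = 0
    for cid, nbrs in neighbors.items():
        if not nbrs:
            continue
        x, y, w, h = cells[cid]
        cx = sum(cells[n][0] for n in nbrs) / len(nbrs)
        cy = sum(cells[n][1] for n in nbrs) / len(nbrs)
        cand = (x + step * (cx - x), y + step * (cy - y), w, h)
        # Reject the move if the candidate would overlap any other cell.
        if any(rects_overlap(cand, cells[o]) for o in cells if o != cid):
            continue
        cells[cid] = list(cand)
        accepted += 1
    return accepted
```

Because rejected moves leave the placement untouched, a pass that starts from a legal placement ends with one.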

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New technique: identify cells with longest edges (top 20%), move them
toward their connected neighbors' centroid, then re-solve with short GD.
This breaks local WL minima by relocating problematic cells.

Results: 0.4015 avg WL on tests 1-9, 0.0000 overlap. Best single test:
0.3361 (test 7). Approaching leaderboard rank 10 (Valouev at 0.3577).
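
The selection and relocation steps of this targeted scatter can be sketched as below. The function names, the Manhattan edge length and the two-pin edge list are assumptions, and the short GD re-solve that follows relocation is not shown. The 20% fraction comes from the commit message.

```python
def longest_edge_cells(pos, edges, frac=0.2):
    """Return the cells whose longest incident edge ranks in the top `frac`.

    pos: dict id -> (x, y); edges: list of (a, b) pairs.
    """
    longest = {c: 0.0 for c in pos}
    for a, b in edges:
        d = abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])  # Manhattan length
        longest[a] = max(longest[a], d)
        longest[b] = max(longest[b], d)
    k = max(1, int(len(pos) * frac))
    return sorted(longest, key=longest.get, reverse=True)[:k]

def relocate_to_centroid(pos, edges, targets):
    """Return a copy of pos with each target cell moved to its neighbors' centroid."""
    nbrs = {c: [] for c in pos}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    new_pos = dict(pos)
    for c in targets:
        if nbrs[c]:
            # Centroids use the original positions, so target order does not matter.
            new_pos[c] = (
                sum(pos[n][0] for n in nbrs[c]) / len(nbrs[c]),
                sum(pos[n][1] for n in nbrs[c]) / len(nbrs[c]),
            )
    return new_pos
```

The relocated cells will generally overlap their neighbors, which is why the commit follows this step with a short GD re-solve.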

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Multi-scatter: 3 iterations of targeted scatter+reconverge. Each round
  identifies new high-WL cells and relocates them. WL 0.40 → 0.3842.
- Nuclear/SEMF loss: LJ and Bethe-Weizsacker inspired potentials tested.
  Negligible impact — redundant with existing WL loss.
- Updated PROGRESS.md with all experiments (Runs 14-21).

Leaderboard: rank ~9-10 (between Valouev 0.3577 and Del Monte 0.3427).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Region-aware pull toward macros hurt WL (0.387 vs 0.372) — disrupts GD positions.
Tested 5 scatter iterations: gains saturate at 3 (no improvement after that).
Best avg WL: 0.3687. Best individual: 0.2592 (test 10), 0.3232 (test 7).
Nuclear/SEMF loss experiments documented.

Final position: 0.0000 overlap, 0.37 WL, rank ~9.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pipeline with fixed-point iteration: each pass does barycentric refinement,
targeted scatter, short GD on WL only, then re-legalize. Tracks best WL
and reverts if a pass doesn't improve.

WL: 0.3695 avg (tests 1-10). Best individual: 0.2620 (test 10).
0.0000 overlap maintained throughout pipeline.

Pipeline passes saturate at ~3. Barycentric O(N²) overlap check is the
speed bottleneck for large tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Barycentric: momentum (0.7) accumulates velocity across passes,
  spatial hash for O(1) overlap check instead of O(N)
- Legalization: sort cells by macro affinity region before row packing
  so connected cells stay near their macro
- Pipeline passes with fixed-point iteration and best-so-far tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ashvin/net_legalize.py: initial implementation of net-aware legalization
that scores candidate slots by alpha*displacement + beta*WL_delta.
Not yet integrated into solver pipeline — needs planning + testing.

Based on Abacus/BonnPlace literature: legalization should minimize
displacement AND wirelength delta, not just pack into rows blindly.
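
As a rough illustration of the slot score alpha*displacement + beta*WL_delta, here is a minimal sketch. It is not ashvin/net_legalize.py. It assumes two-pin Manhattan wirelength, measures displacement from the cell's pre-legalization position, and takes the candidate slots as given. All function names are hypothetical.

```python
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def wirelength(xy, nbr_xy):
    """Sum of Manhattan distances from a cell position to its connected neighbors."""
    return sum(manhattan(xy, n) for n in nbr_xy)

def best_slot(origin, slots, nbr_xy, alpha=1.0, beta=1.0):
    """Pick the candidate slot minimizing alpha*displacement + beta*WL_delta.

    origin: the cell's position before legalization.
    slots: non-empty list of candidate legal (x, y) positions.
    nbr_xy: positions of the cells it is connected to.
    """
    base_wl = wirelength(origin, nbr_xy)

    def score(slot):
        displacement = manhattan(slot, origin)
        wl_delta = wirelength(slot, nbr_xy) - base_wl
        return alpha * displacement + beta * wl_delta

    return min(slots, key=score)
```

With beta set to 0 this reduces to pure minimum-displacement legalization. Raising beta lets a cell travel further when doing so shortens its nets.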

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pipeline: row-packing guarantees zero overlap, then net-aware legalizer
tries each cell at barycentric target + gap positions, scoring by
alpha*displacement + beta*WL_delta. Reverts if no improvement.

Best results: 0.3613 avg WL (tests 1-10), 0.0000 overlap.
Test 10: 0.2292 WL (was 0.2592). Test 7: 0.3177 (was 0.3232).

Leaderboard position: ~rank 9 (between Valouev 0.3577 and Paleja 0.3311).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ashvin/detailed.py: post-legalization WL optimization via:
- Pair swap: swap same-height neighbors if WL improves
- Reinsertion: move worst-WL cells to barycentric target gaps

Capped at N<=300 due to O(N*bins) swap evaluation cost.
Improved WL on small tests: test 7: 0.32→0.31, test 8: 0.34→0.32.
Best avg WL: 0.3613 (tests 1-10), 0.0000 overlap.

Pipeline: GD → net-aware legalize → repair → barycentric → scatter →
[scatter+reconverge]×3 → detailed placement (swaps+reinsertion)

Leaderboard: rank ~9 (0.36 WL, between Valouev 0.36 and Paleja 0.33).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full pipeline results (tests 1-10): 0.0000 overlap, 0.3540 WL.
Test 7: 0.3059, Test 10: 0.2292. Just 0.01 behind Del Monte (#9).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three structural changes targeting root cause (GD→legalization WL damage):

1. Cell inflation (8%): inflate sizes during GD so overlap penalty spreads
   cells further apart. Deflate before legalization → natural gaps →
   legalization needs minor corrections only.

2. Anchor loss: after legalization, GD refinement tethered to legal
   positions via lambda_anchor * ||pos - anchor||^2. Prevents cells from
   drifting far, so next legalization is a small correction.

3. Topology-preserving legalization: re-center compacted rows at GD
   centroid instead of always pushing rightward.

Also adds:
- WL-priority legalization (wl_legalize.py)
- Row reordering + cross-row reinsertion (global_swap.py)
- SA refinement (sa_refine.py) — minimal impact, not in default pipeline
- Optuna v2 config (best_config_v2.json)
- Multistart with wl_priority + spectral variants

Results: avg WL 0.368 → 0.358 (2.7% improvement, all 9 tests improved).
With multistart, test 3 reaches 0.324 (22% improvement via spectral init).
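
The anchor term lambda_anchor * ||pos - anchor||^2 and the inflation step can be written out as in the sketch below. The function names and list-of-tuples layout are assumptions, as is scaling width and height by the same factor. Only the formula and the 8% figure come from the commit message.

```python
def inflate(sizes, factor=0.08):
    """Scale each (width, height) by 1 + factor for use during GD only."""
    return [(w * (1.0 + factor), h * (1.0 + factor)) for w, h in sizes]

def anchor_loss_and_grad(pos, anchors, lambda_anchor):
    """Quadratic tether of each cell to its last legal position.

    Returns (loss, grad), where grad[i] is d(loss)/d(pos[i]).
    """
    loss = 0.0
    grad = []
    for (x, y), (ax, ay) in zip(pos, anchors):
        dx, dy = x - ax, y - ay
        loss += lambda_anchor * (dx * dx + dy * dy)
        # Gradient of lambda * (dx^2 + dy^2) with respect to (x, y).
        grad.append((2.0 * lambda_anchor * dx, 2.0 * lambda_anchor * dy))
    return loss, grad
```

The gradient grows linearly with distance from the anchor, so small WL-driven moves stay cheap while large drifts are penalized.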

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New swap_engine.py with two move types:
- Within-row swap: exchange cell ordering, recompact (always legal)
- Cross-row reinsertion: remove from source row, insert near barycentric
  target in destination row

Test 1 improves 0.387→0.369 (+4.8%) from cross-row moves.
Tests 2-9 mixed (slight regressions on some from within-row swaps).

Also reverts topology-preserving legalization (caused regression).
Keeps cell inflation + anchor loss which give consistent improvement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Plan: interleaved legalize-GD → legalization-aware GD → constructive placement
Visual analysis shows Abacus fails when GD clusters need compaction (test 3)
but wins when preserving GD neighborhoods (test 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Interleaved legalize-GD tested and rejected: -0.2% to -4.8% on all tests.
Mid-training legalization disrupts Adam momentum. The current pipeline
(full GD → legalize → anchored-GD-polish × 5) is already the right structure.

Saved constructive placement plan (island clustering approach):
form legal blocks → promote to macro units → coarse/fine placement with LR schedule

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Row penalty (sin²πy ramped in main GD loop) helps T2 +6.1% but hurts
T1 -5%, T3 -1.9%. Same pattern as all GD modifications: helps some tests,
hurts others, net flat. GD→legalize architecture has ceiling ~0.35-0.36.
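
A sketch of the row penalty follows. It reads "sin²πy" as sin²(πy) and assumes rows sit at integer y (unit row height); both are assumptions about the repo. The function name and the single `weight` argument, which would be ramped over epochs by the caller, are hypothetical.

```python
import math

def row_penalty(ys, weight):
    """Penalty that is zero on integer rows and maximal halfway between them.

    Returns (loss, grad), where grad[i] is d(loss)/d(ys[i]).
    """
    loss = sum(weight * math.sin(math.pi * y) ** 2 for y in ys)
    # d/dy sin^2(pi*y) = 2*pi*sin(pi*y)*cos(pi*y) = pi*sin(2*pi*y)
    grad = [weight * math.pi * math.sin(2.0 * math.pi * y) for y in ys]
    return loss, grad
```

The gradient vanishes both on rows and exactly halfway between them, so a cell at y = k + 0.5 gets no push from this term alone.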

Moving to Step 3: constructive placement (island clustering).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of replacing GD, use island clustering to create connectivity-aware
initial positions that feed into the existing GD pipeline.

Island init wins tests 1 (0.400) and 2 (0.313, best ever).
Spectral wins test 3 (0.311, best ever). Greedy wins tests 4-5.
Multistart picks best per test automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added force-directed init (iterative neighbor averaging) and sequential
placement (degree-ordered, near placed neighbors). Compared all 4:
random 0.371 < force_dir 0.391 < sequential 0.400 < spectral 0.428 (lower is better)

GD is robust to random init. Connectivity-aware inits cluster too
tightly, making overlap harder. Init is NOT the bottleneck.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cks it

Rewrote Abacus to use barycentric WL targets instead of displacement.
Raw comparison: beats greedy 7/9 tests by 1-4%.
Full pipeline: regresses because pipeline passes are co-adapted to greedy.
Reverted to greedy-only. Logged findings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Place cells one-by-one at WL-optimal positions (most-connected first),
then swap refinement. No GD, no legalization, no overlap loss.

Wins tests 1,3,4 (WL 10-12% better than GD pipeline).
Loses on larger tests — swap engine needs more iterations.
Residual overlap 3-40% — compaction needs work.
Fast runtime (3-42s vs 30-300s for GD pipeline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Project cells to nearest macro boundary when overlap detected.
Macro overlaps reduced from 5-16 pairs to 1-6.
WL beats GD pipeline on tests 1,4,9 without legalization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Precompute macro-blocked x-intervals per row before placing any cell.
All cell placements use best_legal_x() to find nearest legal position.
Macro legalization runs first to separate overlapping macros.
compact_row skips blocked intervals when pushing cells right.

Result: 0.0000 overlap on all 9 tests. WL avg 0.413 (worse than GD
pipeline 0.358) because macro legalization pushes macros too far apart.
Need better macro positioning next.
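
To make the blocked-interval idea concrete, here is a minimal sketch. The name `best_legal_x` appears in the commit message, but its signature and body here are guesses, and `blocked_intervals` is a hypothetical helper. Macros are given as (x_left, y_bottom, width, height). The sketch only avoids macros and does not account for cells already placed in the row.

```python
def blocked_intervals(row_y, row_h, macros):
    """Merged, sorted x-intervals of a row that are covered by macros."""
    spans = sorted(
        (mx, mx + mw)
        for mx, my, mw, mh in macros
        if my < row_y + row_h and my + mh > row_y  # macro intersects the row vertically
    )
    merged = []
    for lo, hi in spans:
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def best_legal_x(x, width, blocked, row_x0, row_x1):
    """Left edge nearest to x where a cell of this width fits between blocked intervals.

    Returns None if no free gap in [row_x0, row_x1] is wide enough.
    """
    best = None
    cursor = row_x0
    # The sentinel interval closes the last gap at the row's right end.
    for lo, hi in blocked + [(row_x1, row_x1)]:
        gap_lo, gap_hi = cursor, min(lo, row_x1)
        if gap_hi - gap_lo >= width:
            cand = min(max(x, gap_lo), gap_hi - width)  # clamp x into the gap
            if best is None or abs(cand - x) < abs(best - x):
                best = cand
        cursor = max(cursor, hi)
    return best
```

Precomputing `blocked_intervals` once per row means each placement query only walks that row's short interval list.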

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ashvin-verma and others added 15 commits March 23, 2026 23:45
Constructive v2 now achieves 0.0000 overlap on all 9 tests.
Macro gaps + blocked interval skipping in compact_row.
Spatially-aware placement order (cells sorted by anchor macro position).

WL avg 0.447 (worse than GD 0.358) — macro gaps push cells far from
optimal. T2 beats GD (0.331 vs 0.338). Swap engine needs more
iterations and better moves to recover WL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reduced macro gaps (base 0.5, per-shared 0.3)
- Added within-row swaps back with proper both-cell WL evaluation
- Track swapped pairs to prevent oscillation
- Added macro legalization before blocked interval computation

Zero overlap on 7/9 tests. T2 beats GD (0.326 vs 0.338).
Avg WL 0.419 vs GD 0.358 — macro placement still too spread.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tests

- Macro gaps sized by shared cell width / rows spanned
- Bidirectional compact: right sweep resolves overlaps, left sweep pulls
  cells back toward targets. Distributes displacement symmetrically.
- Deferred compaction: place all cells first, compact once at end
- Zero overlap on ALL 9 tests
- T2 best ever: 0.323 (beats GD pipeline 0.338)
- Avg WL 0.417 vs GD 0.358 — gap from cell distribution, not overlaps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1: iterative barycentric averaging (20 iters, 0.7 damping)
  Places all cells at connectivity-optimal positions (overlapping)
Phase 2: snap to rows, spread via bidirectional compaction
  Minimal displacement from targets

Dropped macro gaps (were hurting, not helping).
Zero overlap all 9 tests. Avg WL 0.406 vs GD 0.358.
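
Phase 1 can be illustrated with the sketch below. It assumes "0.7 damping" means each cell moves 70% of the way to its neighbors' centroid per iteration, that macros stay fixed, and that all cells update simultaneously from the previous iteration's positions. The function name and data layout are hypothetical. The defaults of 20 iterations and 0.7 come from the commit message.

```python
def barycentric_average(pos, neighbors, fixed, iters=20, damping=0.7):
    """Iteratively pull movable cells toward the centroid of their neighbors.

    pos: dict id -> (x, y); neighbors: dict id -> list of ids; fixed: set of ids.
    Returns new positions; overlaps are allowed at this stage.
    """
    pos = dict(pos)
    for _ in range(iters):
        new_pos = {}
        for c, (x, y) in pos.items():
            nbrs = neighbors.get(c, [])
            if c in fixed or not nbrs:
                new_pos[c] = (x, y)
                continue
            cx = sum(pos[n][0] for n in nbrs) / len(nbrs)
            cy = sum(pos[n][1] for n in nbrs) / len(nbrs)
            # Damped step: move only part of the way to the centroid.
            new_pos[c] = (x + damping * (cx - x), y + damping * (cy - y))
        pos = new_pos
    return pos
```

Without fixed cells every position would drift toward a single point, so the fixed macros are what give the result its spatial extent.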

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BFS places cells in connectivity order from macros but has ordering
dependency. Iterative averaging optimizes all connections simultaneously.
Reverted to averaging. Logged BFS results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mple

Plots confirm all 3 phase 1 variants (avg, BFS, BFS+avg) produce
identical cell clusters. The averaging dominates. Phase 1 is not
the bottleneck — phase 2 (spreading overlapping cluster into rows)
is where WL is lost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When cells overlap, try moving less-connected cell to adjacent row
first. Only push right if no room in adjacent rows. Distributes
cells across more rows, reducing cascade pushes.

Improves 6/9 tests. T5 +0.010, T7 +0.013. Zero overlap on 8/9.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GD for phase 1 (optimal overlapping positions) + constructive
phase 2 (WL-aware spreading). Zero overlap all tests.

T2=0.315 is best ever on any test. But avg 0.402 vs GD pipeline
0.358 — spreading adds 0.15 WL vs legalization's 0.11.
The constructive spreading is worse than greedy legalization at
preserving WL. Need fundamentally better spreading.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stripped cross-row moves from spreading. Within-row compact gives
identical results (0.406 constr, 0.403 hybrid). Confirms y-assignment
is the bottleneck — cells cluster in too few rows after averaging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move cells from overloaded rows to adjacent ones to reduce x-compaction.
Result: avg unchanged (0.406 constr, 0.403 hybrid). The y-displacement
from cross-row moves cancels the x-compaction benefit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Distribute cells across rows with capacity limits instead of all going
to nearest row. Soft repulsion tested (zero effect — most nearby cells
ARE neighbors). Load balancing helps T1/T3/T6/T9 but hurts T5/T8.

Spreading improvements are at diminishing returns. The 0.405 avg
with zero overlap is the constructive baseline. Next: swap engine
improvements or completely different spreading approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace defaultdict + nested Python loops with sort-based binning,
torch.unique_consecutive for bin boundaries, and vectorized meshgrid
for within-bin pair generation. Correctness verified on T4/T9.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
T5: 52s->26s (2x), T6: 126s->51s (2.5x), T9: 587s->341s (1.7x).
WL unchanged, zero overlap maintained. Torch sort-based binning
replaces Python defaultdict loops in pair generation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Brute-force O(N^2) pairwise distance check using torch broadcasting
for std cells. No Python loops at all for N<=2000. Sweepline fallback
for larger N. T4: 32s->28s. Correctness verified.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>