Leaderboard submission - Ashvin Verma by ashvin-verma · Pull Request #59 · partcleda/intern_challenge

ashvin-verma · 2026-04-22T07:01:56Z

Adds Ashvin Verma to both the updated and old leaderboard tables.

New leaderboard result:

Avg overlap: 0.0000
Avg normalized WL: 0.3818
Runtime: 825.14s

Old leaderboard comparable result:

Avg overlap: 0.0000
Avg normalized WL: 0.3326
Runtime: 699.03s

Note: Selective projected GD + WL-aware legalizer

Best result CSV: ashvin/results/20260421_124936_selective_projected_shelf_v2_full_suite.csv

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Pipeline: annealed softplus overlap loss → density penalty → deterministic legalization → greedy repair. Scalable spatial hash handles 100K+ cells.

Key components in ashvin/:
- solver.py: single-stage annealed solver (softplus beta 0.1→6.0, lambda ramp)
- overlap.py: two-tier spatial hash (macros exhaustive, std cells binned)
- density.py: bilinear density penalty
- legalize.py: row-based packing with macro repair
- repair.py: greedy nudge with brute-force fallback
- config.py: preset configs for optuna tuning
- run_tests.py: instrumented runner with CSV output
- view.py: versioned placement visualizations

Results: 0.0000 overlap on tests 1-10 (verified on alternate seeds). WL ~0.51 (leaderboard #1 is 0.13; Phase 2 target).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
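For reviewers unfamiliar with the annealed softplus overlap term, a minimal sketch follows. It is an illustration under assumptions, not the code in solver.py: the function names, the linear beta schedule, and the product-of-axes penalty are hypothetical. Only the 0.1→6.0 beta range comes from the commit message.

```python
import math


def softplus(x, beta):
    """Smooth approximation of max(x, 0); it sharpens as beta grows."""
    z = beta * x
    # Numerically stable form: softplus(z) = max(z, 0) + log1p(exp(-|z|))
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta


def pair_overlap_loss(a, b, beta):
    """a and b are (cx, cy, w, h) rectangles; returns a smooth overlap-area proxy."""
    # Signed penetration depth along each axis (positive when the boxes overlap).
    dx = (a[2] + b[2]) / 2 - abs(a[0] - b[0])
    dy = (a[3] + b[3]) / 2 - abs(a[1] - b[1])
    return softplus(dx, beta) * softplus(dy, beta)


def beta_at(epoch, n_epochs, beta_start=0.1, beta_end=6.0):
    """Linear anneal from a soft penalty to a sharper one (schedule shape is assumed)."""
    t = epoch / max(n_epochs - 1, 1)
    return beta_start + t * (beta_end - beta_start)
```

At low beta every pair of cells feels a small repulsive gradient, which helps early spreading; at high beta the penalty approaches the true overlap area, so separated cells stop interacting.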

Key fixes:
- Macro-macro repair pass in legalization (resolves stale position bug)
- Obstacle re-checking loop in row packing (catches cascading shifts)
- Brute-force repair fallback for small N (catches bin-boundary misses)
- Iterative legalize-repair cycles (converges for 100K cells)
- Adaptive epoch scaling (200 epochs for N>10K, 500 for N>2K)

Test 12 (100K cells): 0.0000 overlap in 721s (previously 0.1491).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Added ashvin/wl_optimize.py with gradient WL polish: runs GD on wirelength only (macros frozen), then re-legalizes to maintain zero overlap. 3 cycles with decreasing LR. WL improved 0.5132 → 0.4971 on tests 1-10. Bottleneck identified: row-based legalization adds ~0.05 WL penalty per pass. GD achieves 0.40 WL but legalization snaps it to 0.45+. Need minimal-disturbance legalization or cell swap post-processing for competitive WL (~0.13). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cell swap: swap same-height nearby cells if WL improves without overlap. Gradient polish: GD on WL only → re-legalize, 3 cycles. WL improved: 0.5132 → 0.4912 on tests 1-10. Still 0.0000 overlap. Test 10 swap phase is slow (509s) due to O(N) overlap check per swap. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- ashvin/tune.py: optuna hyperparameter search (30 trials in 3 min)
- ashvin/wl_optimize.py: cell swap + gradient WL polish
- ashvin/solver.py: LR schedule support (warmup, warmup_cosine, constant)

Optuna best config: WL 0.4544 (from 0.4971), 0.0000 overlap on all tests. Key finding: softer beta (2.09), lower LR (0.003), fewer epochs (500) wins. Config saved to ashvin/results/best_config.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
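The three schedule names could map to something like the sketch below. The function name `lr_at`, the 10% warmup fraction, and the exact curve shapes are assumptions for illustration; the commit only names the schedules and the tuned base LR of 0.003.

```python
import math


def lr_at(step, total_steps, base_lr=0.003, schedule="warmup_cosine", warmup_frac=0.1):
    """Learning rate at `step` for one of the three named schedules (shapes assumed)."""
    if schedule == "constant":
        return base_lr
    warmup_steps = max(int(total_steps * warmup_frac), 1)
    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    if schedule == "warmup":
        return base_lr
    if schedule == "warmup_cosine":
        # Cosine decay from base_lr toward zero over the remaining steps.
        progress = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
        return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
    raise ValueError(f"unknown schedule: {schedule}")
```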

- ashvin/legalize.py: legalize_min_disturbance() nudges minimum distance (experimental; row-based is still the default and more reliable)
- ashvin/view.py: edge visualization shows long edges in red, WL in title
- ashvin/run_tests.py: --no-wl-polish flag for faster benchmarking
- ashvin/tune.py: lr_schedule as tunable param
- ashvin/solver.py: lr_schedule support (warmup, warmup_cosine, constant)

Min-disturbance was worse on WL (0.57 vs 0.52) due to cascading nudges. Reverted to row-based as default. WL gap (0.45 vs 0.13) is fundamentally about GD quality, not legalization approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- ashvin/init_placement.py: spectral (eigenvector) initial placement via graph Laplacian; places connected cells near each other
- ashvin/solver.py: solve_multistart() tries random + spectral init, picks best WL result with 0 overlap
- ashvin/wl_optimize.py: cell_swap uses spatial hash for O(1) overlap check instead of O(N); 500x faster on test 10

Multi-start WL: 0.4468 (from 0.4544). Some tests see big gains from spectral init (test 3: 0.63→0.33, test 9: 0.50→0.37), others prefer random. Multi-start picks the best per test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
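A spatial hash that supports the overlap check used by cell swaps might look like the following. This is a sketch, not the structure in overlap.py or wl_optimize.py: the class name, method names, and the (cx, cy, w, h) rectangle layout are hypothetical. Lookups are roughly constant time only while each bin holds a handful of cells.

```python
from collections import defaultdict


def rects_overlap(a, b):
    """a and b are (cx, cy, w, h); touching edges do not count as overlap."""
    return (abs(a[0] - b[0]) < (a[2] + b[2]) / 2
            and abs(a[1] - b[1]) < (a[3] + b[3]) / 2)


class SpatialHash:
    def __init__(self, bin_size):
        self.bin_size = bin_size
        self.bins = defaultdict(set)   # (ix, iy) -> ids of cells touching that bin
        self.rects = {}                # id -> (cx, cy, w, h)

    def _keys(self, rect):
        cx, cy, w, h = rect
        x0 = int((cx - w / 2) // self.bin_size)
        x1 = int((cx + w / 2) // self.bin_size)
        y0 = int((cy - h / 2) // self.bin_size)
        y1 = int((cy + h / 2) // self.bin_size)
        for ix in range(x0, x1 + 1):
            for iy in range(y0, y1 + 1):
                yield (ix, iy)

    def insert(self, cell_id, rect):
        self.rects[cell_id] = rect
        for key in self._keys(rect):
            self.bins[key].add(cell_id)

    def remove(self, cell_id):
        rect = self.rects.pop(cell_id)
        for key in self._keys(rect):
            self.bins[key].discard(cell_id)

    def collides(self, rect, ignore=()):
        """True if rect overlaps any stored cell other than those in `ignore`."""
        for key in self._keys(rect):
            for other in self.bins.get(key, ()):
                if other not in ignore and rects_overlap(rect, self.rects[other]):
                    return True
        return False
```

A swap check would then call `collides` for each of the two candidate positions with both swapped cells in `ignore`, instead of scanning all N cells.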

100 trials, best trial 97. Key: higher lambda_wl (3.58), warmup_cosine LR, low overlap start (1.23), soft beta (2.03). WL improved 0.45 → 0.41. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Barycentric refinement: move cells toward centroid of neighbors, accept if no overlap. ~2% WL improvement, fast, always on.
- Scatter solver: explosive scatter + reconverge. Tested scatter factors 1.0-2.0; doesn't help (disrupted solutions don't find better minima).
- Disabled slow cell swap + GD polish by default.

Best WL: ~0.45 with optuna config + barycentric. Rank ~10.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
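The barycentric refinement step can be sketched as below. The data layout, the `step` fraction, and the brute-force overlap scan are assumptions made to keep the example self-contained; the actual pass may differ.

```python
def overlaps(a, b):
    """a and b are [cx, cy, w, h]; touching edges do not count as overlap."""
    return (abs(a[0] - b[0]) < (a[2] + b[2]) / 2
            and abs(a[1] - b[1]) < (a[3] + b[3]) / 2)


def barycentric_pass(cells, neighbors, step=0.5):
    """Move each listed cell part-way toward the centroid of its neighbors.

    cells: {id: [cx, cy, w, h]}, neighbors: {id: [neighbor ids]}.
    A move is kept only if the new position overlaps no other cell.
    Mutates `cells` in place and returns the number of accepted moves.
    """
    accepted = 0
    for cid, nbrs in neighbors.items():
        if not nbrs:
            continue
        cx, cy, w, h = cells[cid]
        tx = sum(cells[n][0] for n in nbrs) / len(nbrs)
        ty = sum(cells[n][1] for n in nbrs) / len(nbrs)
        cand = [cx + step * (tx - cx), cy + step * (ty - cy), w, h]
        # Brute-force O(N) legality check, used here only for brevity.
        if any(overlaps(cand, cells[o]) for o in cells if o != cid):
            continue
        cells[cid] = cand
        accepted += 1
    return accepted
```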

New technique: identify cells with longest edges (top 20%), move them toward their connected neighbors' centroid, then re-solve with short GD. This breaks local WL minima by relocating problematic cells. Results: 0.4015 avg WL on tests 1-9, 0.0000 overlap. Best single test: 0.3361 (test 7). Approaching leaderboard rank 10 (Valouev at 0.3577). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
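A sketch of the selection and relocation steps described above. Two assumptions are made for illustration: cells are ranked by total Manhattan length of their incident edges (ranking by the single longest edge would be another reading), and the follow-up short GD solve is omitted. Function names are hypothetical.

```python
import math


def select_high_wl_cells(pos, edges, frac=0.2):
    """Return ids of the top `frac` of cells by total incident edge length."""
    total = {cid: 0.0 for cid in pos}
    for u, v in edges:
        d = abs(pos[u][0] - pos[v][0]) + abs(pos[u][1] - pos[v][1])
        total[u] += d
        total[v] += d
    k = max(1, math.ceil(frac * len(pos)))
    return sorted(total, key=total.get, reverse=True)[:k]


def relocate_to_centroid(pos, edges, targets):
    """Return a new position dict with each target cell moved to its neighbors' centroid."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    new_pos = dict(pos)
    for cid in targets:
        nbrs = adj.get(cid, [])
        if nbrs:
            new_pos[cid] = (sum(pos[n][0] for n in nbrs) / len(nbrs),
                            sum(pos[n][1] for n in nbrs) / len(nbrs))
    return new_pos
```

The relocated positions will generally overlap other cells, which is why the commit follows this step with a short re-solve.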

- Multi-scatter: 3 iterations of targeted scatter+reconverge. Each round identifies new high-WL cells and relocates them. WL 0.40 → 0.3842.
- Nuclear/SEMF loss: LJ and Bethe-Weizsacker inspired potentials tested. Negligible impact; redundant with existing WL loss.
- Updated PROGRESS.md with all experiments (Runs 14-21).

Leaderboard: rank ~9-10 (between Valouev 0.3577 and Del Monte 0.3427).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Region-aware pull toward macros hurt WL (0.387 vs 0.372) — disrupts GD positions. 5 scatter iterations saturate at 3 (no improvement after). Best avg WL: 0.3687. Best individual: 0.2592 (test 10), 0.3232 (test 7). Nuclear/SEMF loss experiments documented. Final position: 0.0000 overlap, 0.37 WL, rank ~9. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Pipeline with fixed-point iteration: each pass does barycentric refinement, targeted scatter, short GD on WL only, then re-legalize. Tracks best WL and reverts if a pass doesn't improve. WL: 0.3695 avg (tests 1-10). Best individual: 0.2620 (test 10). 0.0000 overlap maintained throughout pipeline. Pipeline passes saturate at ~3. Barycentric O(N²) overlap check is the speed bottleneck for large tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
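The best-so-far tracking can be sketched generically. Here `refine` stands in for one full pass (barycentric, scatter, short GD, re-legalize) and `wirelength` for the WL metric; both are placeholders, and stopping at the first non-improving pass is an assumption.

```python
def run_passes(placement, refine, wirelength, max_passes=5):
    """Repeat `refine` until a pass fails to lower wirelength, keeping the best result."""
    best = placement
    best_wl = wirelength(best)
    for _ in range(max_passes):
        candidate = refine(best)
        wl = wirelength(candidate)
        if wl >= best_wl:
            # Revert: discard the candidate and treat `best` as the fixed point.
            break
        best, best_wl = candidate, wl
    return best, best_wl
```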

- Barycentric: momentum (0.7) accumulates velocity across passes, spatial hash for O(1) overlap check instead of O(N)
- Legalization: sort cells by macro affinity region before row packing so connected cells stay near their macro
- Pipeline passes with fixed-point iteration and best-so-far tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ashvin/net_legalize.py: initial implementation of net-aware legalization that scores candidate slots by alpha*displacement + beta*WL_delta. Not yet integrated into solver pipeline — needs planning + testing. Based on Abacus/BonnPlace literature: legalization should minimize displacement AND wirelength delta, not just pack into rows blindly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
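The slot score alpha*displacement + beta*WL_delta can be illustrated as follows. This sketch assumes Manhattan distances and a star-model wirelength (sum of distances to connected neighbors); the function names and default weights are hypothetical and not taken from net_legalize.py.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def star_wl(p, nbr_positions):
    """Star-model wirelength: total Manhattan distance from p to each neighbor."""
    return sum(manhattan(p, q) for q in nbr_positions)


def score_slot(slot, gd_pos, nbr_positions, alpha=1.0, beta=1.0):
    """Lower is better: displacement from the GD position plus the change in wirelength."""
    displacement = manhattan(slot, gd_pos)
    wl_delta = star_wl(slot, nbr_positions) - star_wl(gd_pos, nbr_positions)
    return alpha * displacement + beta * wl_delta


def best_slot(slots, gd_pos, nbr_positions, alpha=1.0, beta=1.0):
    """Pick the candidate slot with the lowest score (first one wins ties)."""
    return min(slots, key=lambda s: score_slot(s, gd_pos, nbr_positions, alpha, beta))
```

With beta set to 0 this reduces to pure minimum-displacement legalization; a positive beta breaks ties between equally distant slots in favor of the one closer to the cell's neighbors.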

Pipeline: row-packing guarantees zero overlap, then net-aware legalizer tries each cell at barycentric target + gap positions, scoring by alpha*displacement + beta*WL_delta. Reverts if no improvement. Best results: 0.3613 avg WL (tests 1-10), 0.0000 overlap. Test 10: 0.2292 WL (was 0.2592). Test 7: 0.3177 (was 0.3232). Leaderboard position: ~rank 9 (between Valouev 0.3577 and Paleja 0.3311). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ashvin/detailed.py: post-legalization WL optimization via:
- Pair swap: swap same-height neighbors if WL improves
- Reinsertion: move worst-WL cells to barycentric target gaps

Capped at N<=300 due to O(N*bins) swap evaluation cost. Improved WL on small tests: test 7: 0.32→0.31, test 8: 0.34→0.32. Best avg WL: 0.3613 (tests 1-10), 0.0000 overlap.

Pipeline: GD → net-aware legalize → repair → barycentric → scatter → [scatter+reconverge]×3 → detailed placement (swaps+reinsertion)

Leaderboard: rank ~9 (0.36 WL, between Valouev 0.36 and Paleja 0.33).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Full pipeline results (tests 1-10): 0.0000 overlap, 0.3540 WL. Test 7: 0.3059, Test 10: 0.2292. Just 0.01 behind Del Monte (#9). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Three structural changes targeting root cause (GD→legalization WL damage):
1. Cell inflation (8%): inflate sizes during GD so overlap penalty spreads cells further apart. Deflate before legalization → natural gaps → legalization needs minor corrections only.
2. Anchor loss: after legalization, GD refinement tethered to legal positions via lambda_anchor * ||pos - anchor||^2. Prevents cells from drifting far, so next legalization is a small correction.
3. Topology-preserving legalization: re-center compacted rows at GD centroid instead of always pushing rightward.

Also adds:
- WL-priority legalization (wl_legalize.py)
- Row reordering + cross-row reinsertion (global_swap.py)
- SA refinement (sa_refine.py); minimal impact, not in default pipeline
- Optuna v2 config (best_config_v2.json)
- Multistart with wl_priority + spectral variants

Results: avg WL 0.368 → 0.358 (2.7% improvement, all 9 tests improved). With multistart, test 3 reaches 0.324 (22% improvement via spectral init).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
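The first two changes are simple enough to sketch. The anchor term below follows the formula in the message, lambda_anchor * ||pos - anchor||^2, summed over cells; the function names and the plain-list data layout are assumptions. The gradient is written out by hand so the example needs no autograd library.

```python
def inflate(sizes, factor=0.08):
    """Scale every (w, h) by 1 + factor for use during GD; originals are left untouched."""
    return [(w * (1.0 + factor), h * (1.0 + factor)) for w, h in sizes]


def anchor_loss_and_grad(pos, anchor, lambda_anchor):
    """Quadratic tether to the legalized positions.

    Returns (loss, grad) where grad[i] = 2 * lambda_anchor * (pos[i] - anchor[i]).
    """
    loss = 0.0
    grad = []
    for (x, y), (ax, ay) in zip(pos, anchor):
        dx, dy = x - ax, y - ay
        loss += lambda_anchor * (dx * dx + dy * dy)
        grad.append((2.0 * lambda_anchor * dx, 2.0 * lambda_anchor * dy))
    return loss, grad
```

Because the pull grows linearly with distance from the anchor, small WL-improving moves are cheap while large drifts are penalized, which is what keeps the next legalization a small correction.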

New swap_engine.py with two move types:
- Within-row swap: exchange cell ordering, recompact (always legal)
- Cross-row reinsertion: remove from source row, insert near barycentric target in destination row

Test 1 improves 0.387→0.369 (+4.8%) from cross-row moves. Tests 2-9 mixed (slight regressions on some from within-row swaps).

Also reverts topology-preserving legalization (caused regression). Keeps cell inflation + anchor loss which give consistent improvement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Plan: interleaved legalize-GD → legalization-aware GD → constructive placement Visual analysis shows Abacus fails when GD clusters need compaction (test 3) but wins when preserving GD neighborhoods (test 2). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Interleaved legalize-GD tested and rejected: -0.2% to -4.8% on all tests. Mid-training legalization disrupts Adam momentum. The current pipeline (full GD → legalize → anchored-GD-polish × 5) is already the right structure. Saved constructive placement plan (island clustering approach): form legal blocks → promote to macro units → coarse/fine placement with LR schedule Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Row penalty (sin²πy ramped in main GD loop) helps T2 +6.1% but hurts T1 -5%, T3 -1.9%. Same pattern as all GD modifications: helps some tests, hurts others, net flat. GD→legalize architecture has ceiling ~0.35-0.36. Moving to Step 3: constructive placement (island clustering). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
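For reference, the row penalty and its gradient in a small sketch. It assumes rows sit at integer multiples of `row_height` and that the weight ramps linearly over epochs; both are guesses about the setup, and the function names are hypothetical.

```python
import math


def row_penalty(y, row_height=1.0):
    """sin^2(pi * y / row_height): zero on row lines, maximal halfway between them."""
    return math.sin(math.pi * y / row_height) ** 2


def row_penalty_grad(y, row_height=1.0):
    """d/dy sin^2(pi*y/h) = (pi/h) * sin(2*pi*y/h)."""
    return (math.pi / row_height) * math.sin(2.0 * math.pi * y / row_height)


def ramp_weight(epoch, n_epochs, max_weight):
    """Linear ramp from 0 to max_weight across the run (ramp shape assumed)."""
    return max_weight * epoch / max(n_epochs - 1, 1)
```

The gradient vanishes both on row lines and at the midpoints between them, so a cell sitting exactly halfway gets no push from this term alone.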

Instead of replacing GD, use island clustering to create connectivity-aware initial positions that feed into the existing GD pipeline. Island init wins tests 1 (0.400) and 2 (0.313, best ever). Spectral wins test 3 (0.311, best ever). Greedy wins tests 4-5. Multistart picks best per test automatically. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Added force-directed init (iterative neighbor averaging) and sequential placement (degree-ordered, near placed neighbors).

Compared all 4 by WL (lower is better): random 0.371, force_dir 0.391, sequential 0.400, spectral 0.428. GD is robust to random init. Connectivity-aware inits cluster too tightly, making overlap harder to resolve. Init is NOT the bottleneck.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…cks it Rewrote Abacus to use barycentric WL targets instead of displacement. Raw comparison: beats greedy 7/9 tests by 1-4%. Full pipeline: regresses because pipeline passes are co-adapted to greedy. Reverted to greedy-only. Logged findings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Place cells one-by-one at WL-optimal positions (most-connected first), then swap refinement. No GD, no legalization, no overlap loss. Wins tests 1,3,4 (WL 10-12% better than GD pipeline). Loses on larger tests — swap engine needs more iterations. Residual overlap 3-40% — compaction needs work. Fast runtime (3-42s vs 30-300s for GD pipeline). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Project cells to nearest macro boundary when overlap detected. Macro overlaps reduced from 5-16 pairs to 1-6. WL beats GD pipeline on tests 1,4,9 without legalization. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Precompute macro-blocked x-intervals per row before placing any cell. All cell placements use best_legal_x() to find nearest legal position. Macro legalization runs first to separate overlapping macros. compact_row skips blocked intervals when pushing cells right. Result: 0.0000 overlap on all 9 tests. WL avg 0.413 (worse than GD pipeline 0.358) because macro legalization pushes macros too far apart. Need better macro positioning next. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
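A sketch of the blocked-interval idea. `best_legal_x` is named in the commit, but its signature here is a guess, as are the macro tuple layout (x_left, y_bottom, w, h) and the helper `blocked_intervals`. The sketch considers only macros, ignoring row bounds and already-placed cells.

```python
def blocked_intervals(row_y, row_h, macros):
    """Merged, sorted x-intervals of a row that are covered by some macro.

    macros: list of (x_left, y_bottom, w, h).
    """
    spans = sorted((x, x + w) for x, y, w, h in macros
                   if y < row_y + row_h and y + h > row_y)
    merged = []
    for lo, hi in spans:
        if merged and lo <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], hi)   # extend the current interval
        else:
            merged.append([lo, hi])
    return [tuple(m) for m in merged]


def best_legal_x(target_x, width, blocked):
    """Left-edge x nearest to target_x such that [x, x + width] avoids every blocked interval."""
    def legal(x):
        return all(x + width <= lo or x >= hi for lo, hi in blocked)

    # The optimum is either the target itself or flush against some interval edge.
    candidates = [target_x]
    for lo, hi in blocked:
        candidates.extend([lo - width, hi])
    return min((x for x in candidates if legal(x)), key=lambda x: abs(x - target_x))
```

Computing the intervals once per row before any cell is placed means each placement query only scans that row's short interval list.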

Constructive v2 now achieves 0.0000 overlap on all 9 tests. Macro gaps + blocked interval skipping in compact_row. Spatially-aware placement order (cells sorted by anchor macro position). WL avg 0.447 (worse than GD 0.358) — macro gaps push cells far from optimal. T2 beats GD (0.331 vs 0.338). Swap engine needs more iterations and better moves to recover WL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Reduced macro gaps (base 0.5, per-shared 0.3)
- Added within-row swaps back with proper both-cell WL evaluation
- Track swapped pairs to prevent oscillation
- Added macro legalization before blocked interval computation

Zero overlap on 7/9 tests. T2 beats GD (0.326 vs 0.338). Avg WL 0.419 vs GD 0.358; macro placement still too spread.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…tests

- Macro gaps sized by shared cell width / rows spanned
- Bidirectional compact: right sweep resolves overlaps, left sweep pulls cells back toward targets. Distributes displacement symmetrically.
- Deferred compaction: place all cells first, compact once at end
- Zero overlap on ALL 9 tests
- T2 best ever: 0.323 (beats GD pipeline 0.338)
- Avg WL 0.417 vs GD 0.358; gap from cell distribution, not overlaps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
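One way to get the symmetric displacement that bidirectional compaction aims for is sketched below. It is a stand-in, not necessarily what compact_row does: it runs an independent right sweep and left sweep from the targets and averages them, which stays legal because both sweeps keep every neighbor gap at least one cell width. Macro-blocked intervals are ignored here.

```python
def bidirectional_compact(targets, widths):
    """Spread one row of cells so none overlap, splitting displacement left and right.

    targets: desired left-edge x per cell, listed in row order (ascending).
    widths:  cell widths in the same order.
    """
    n = len(targets)
    if n == 0:
        return []
    # Right sweep: push each cell right until it clears its left neighbor.
    right = list(targets)
    for i in range(1, n):
        right[i] = max(right[i], right[i - 1] + widths[i - 1])
    # Left sweep: push each cell left until it clears its right neighbor.
    left = list(targets)
    for i in range(n - 2, -1, -1):
        left[i] = min(left[i], left[i + 1] - widths[i])
    # Averaging two legal orderings preserves every neighbor gap >= width.
    return [(r + l) / 2 for r, l in zip(right, left)]
```

For three unit-width cells all targeting x = 0, the right sweep alone gives left edges 0, 1, 2, while the averaged result is -1, 0, 1, so the cluster stays centered on its target.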

Phase 1: iterative barycentric averaging (20 iters, 0.7 damping). Places all cells at connectivity-optimal positions (overlapping).
Phase 2: snap to rows, spread via bidirectional compaction. Minimal displacement from targets.

Dropped macro gaps (were hurting, not helping). Zero overlap all 9 tests. Avg WL 0.406 vs GD 0.358.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
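Phase 1 can be sketched as a Jacobi-style relaxation. Two assumptions are made: macros are held fixed as anchors, and "0.7 damping" means each cell moves 70% of the way to its neighbor centroid per iteration (the opposite convention is equally plausible). Names and data layout are hypothetical.

```python
def barycentric_average(pos, adj, fixed, iters=20, damping=0.7):
    """Relax movable cells toward the centroid of their neighbors.

    pos:   {id: (x, y)}
    adj:   {id: [neighbor ids]}
    fixed: set of ids (e.g. macros) that never move.
    Every update within an iteration reads the previous iteration's positions.
    """
    pos = dict(pos)
    for _ in range(iters):
        new_pos = dict(pos)
        for cid, nbrs in adj.items():
            if cid in fixed or not nbrs:
                continue
            tx = sum(pos[n][0] for n in nbrs) / len(nbrs)
            ty = sum(pos[n][1] for n in nbrs) / len(nbrs)
            x, y = pos[cid]
            new_pos[cid] = (x + damping * (tx - x), y + damping * (ty - y))
        pos = new_pos
    return pos
```

Nothing here prevents overlap: connected cells collapse onto each other, which is why Phase 2 has to spread them back out into rows.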

BFS places cells in connectivity order from macros but has ordering dependency. Iterative averaging optimizes all connections simultaneously. Reverted to averaging. Logged BFS results. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…mple Plots confirm all 3 phase 1 variants (avg, BFS, BFS+avg) produce identical cell clusters. The averaging dominates. Phase 1 is not the bottleneck — phase 2 (spreading overlapping cluster into rows) is where WL is lost. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When cells overlap, try moving less-connected cell to adjacent row first. Only push right if no room in adjacent rows. Distributes cells across more rows, reducing cascade pushes. Improves 6/9 tests. T5 +0.010, T7 +0.013. Zero overlap on 8/9. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

GD for phase 1 (optimal overlapping positions) + constructive phase 2 (WL-aware spreading). Zero overlap all tests. T2=0.315 is best ever on any test. But avg 0.402 vs GD pipeline 0.358 — spreading adds 0.15 WL vs legalization's 0.11. The constructive spreading is worse than greedy legalization at preserving WL. Need fundamentally better spreading. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Stripped cross-row moves from spreading. Within-row compact gives identical results (0.406 constr, 0.403 hybrid). Confirms y-assignment is the bottleneck — cells cluster in too few rows after averaging. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move cells from overloaded rows to adjacent ones to reduce x-compaction. Result: avg unchanged (0.406 constr, 0.403 hybrid). The y-displacement from cross-row moves cancels the x-compaction benefit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Distribute cells across rows with capacity limits instead of all going to nearest row. Soft repulsion tested (zero effect — most nearby cells ARE neighbors). Load balancing helps T1/T3/T6/T9 but hurts T5/T8. Spreading improvements are at diminishing returns. The 0.405 avg with zero overlap is the constructive baseline. Next: swap engine improvements or completely different spreading approach. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace defaultdict + nested Python loops with sort-based binning, torch.unique_consecutive for bin boundaries, and vectorized meshgrid for within-bin pair generation. Correctness verified on T4/T9. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
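The sort-then-group idea can be shown with the standard library alone. This is a pure-Python analogue, not the torch code: `sorted` plays the role of the tensor sort, `itertools.groupby` stands in for `torch.unique_consecutive` finding bin boundaries, and `itertools.combinations` replaces the vectorized meshgrid. It only pairs cells within the same bin; a full overlap check would also examine neighboring bins.

```python
from itertools import combinations, groupby


def candidate_pairs(centers, bin_size):
    """Return index pairs (i, j), i < j, of cells whose centers share a grid bin."""
    # One sort key per cell: (bin_x, bin_y, cell_index).
    keyed = sorted((int(x // bin_size), int(y // bin_size), i)
                   for i, (x, y) in enumerate(centers))
    pairs = []
    # After sorting, each bin is one contiguous run, so a single linear scan groups it.
    for _, run in groupby(keyed, key=lambda t: t[:2]):
        ids = [t[2] for t in run]
        pairs.extend(combinations(ids, 2))
    return pairs
```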

T5: 52s->26s (2x), T6: 126s->51s (2.5x), T9: 587s->341s (1.7x). WL unchanged, zero overlap maintained. Torch sort-based binning replaces Python defaultdict loops in pair generation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Brute-force O(N^2) pairwise distance check using torch broadcasting for std cells. No Python loops at all for N<=2000. Sweepline fallback for larger N. T4: 32s->28s. Correctness verified. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ashvin-verma and others added 30 commits March 21, 2026 14:48

Add project setup: PLAN.md, uv config, and ashvin/ working directory

8142f03

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Update best config from optuna v3: WL 0.4091 on all tests

e87c03a

100 trials, best trial 97. Key: higher lambda_wl (3.58), warmup_cosine LR, low overlap start (1.23), soft beta (2.03). WL improved 0.45 → 0.41. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

New best: 0.3540 avg WL, rank 9. Detailed placement working.

e3de003

Full pipeline results (tests 1-10): 0.0000 overlap, 0.3540 WL. Test 7: 0.3059, Test 10: 0.2292. Just 0.01 behind Del Monte (#9). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ashvin-verma and others added 15 commits March 23, 2026 23:45

Final leaderboard submission: projected GD shelf legalizer

25bc343

ashvin-verma wants to merge 45 commits into partcleda:main from ashvin-verma:main
