Leaderboard submission - Ashvin Verma #59
Open
ashvin-verma wants to merge 45 commits into partcleda:main
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pipeline: annealed softplus overlap loss → density penalty → deterministic legalization → greedy repair. Scalable spatial hash handles 100K+ cells.

Key components in ashvin/:
- solver.py: single-stage annealed solver (softplus beta 0.1→6.0, lambda ramp)
- overlap.py: two-tier spatial hash (macros exhaustive, std cells binned)
- density.py: bilinear density penalty
- legalize.py: row-based packing with macro repair
- repair.py: greedy nudge with brute-force fallback
- config.py: preset configs for optuna tuning
- run_tests.py: instrumented runner with CSV output
- view.py: versioned placement visualizations

Results: 0.0000 overlap on tests 1-10 (verified on alternate seeds). WL ~0.51 (leaderboard #1 is 0.13; Phase 2 target).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
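For readers unfamiliar with the annealed softplus overlap loss named above, here is a minimal sketch for a single pair of axis-aligned cells. It is plain Python rather than the tensor code in solver.py, and the function names, the per-axis penetration formula, and the linear shape of the beta schedule are all assumptions for illustration; only the 0.1→6.0 range comes from the commit message.

```python
import math

def softplus(x, beta):
    """Smooth stand-in for max(x, 0); approaches it as beta grows."""
    z = beta * x
    # Stable form: log(1 + e^z) = max(z, 0) + log1p(e^-|z|)
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta

def pair_overlap_loss(a, b, beta):
    """a, b are (cx, cy, w, h): center coordinates and size (assumed layout)."""
    pen_x = (a[2] + b[2]) / 2 - abs(a[0] - b[0])  # > 0 when x-extents overlap
    pen_y = (a[3] + b[3]) / 2 - abs(a[1] - b[1])  # > 0 when y-extents overlap
    return softplus(pen_x, beta) * softplus(pen_y, beta)

def beta_at(epoch, n_epochs, beta_start=0.1, beta_end=6.0):
    """Linear anneal from beta_start to beta_end (schedule shape is assumed)."""
    t = epoch / max(n_epochs - 1, 1)
    return beta_start + t * (beta_end - beta_start)
```

With a small beta the penalty is soft and non-zero even for separated cells, which gives distant cells a gradient early on; as beta rises the loss approaches the true overlap area.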
Key fixes:
- Macro-macro repair pass in legalization (resolves stale position bug)
- Obstacle re-checking loop in row packing (catches cascading shifts)
- Brute-force repair fallback for small N (catches bin-boundary misses)
- Iterative legalize-repair cycles (converges for 100K cells)
- Adaptive epoch scaling (200 epochs for N>10K, 500 for N>2K)

Test 12 (100K cells): 0.0000 overlap in 721s (previously 0.1491).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added ashvin/wl_optimize.py with gradient WL polish: runs GD on wirelength only (macros frozen), then re-legalizes to maintain zero overlap. 3 cycles with decreasing LR. WL improved 0.5132 → 0.4971 on tests 1-10. Bottleneck identified: row-based legalization adds ~0.05 WL penalty per pass. GD achieves 0.40 WL but legalization snaps it to 0.45+. Need minimal-disturbance legalization or cell swap post-processing for competitive WL (~0.13). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cell swap: swap same-height nearby cells if WL improves without overlap. Gradient polish: GD on WL only → re-legalize, 3 cycles. WL improved: 0.5132 → 0.4912 on tests 1-10. Still 0.0000 overlap. Test 10 swap phase is slow (509s) due to O(N) overlap check per swap. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ashvin/tune.py: optuna hyperparameter search (30 trials in 3 min)
- ashvin/wl_optimize.py: cell swap + gradient WL polish
- ashvin/solver.py: LR schedule support (warmup, warmup_cosine, constant)

Optuna best config: WL 0.4544 (from 0.4971), 0.0000 overlap on all tests. Key finding: softer beta (2.09), lower LR (0.003), fewer epochs (500) wins. Config saved to ashvin/results/best_config.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ashvin/legalize.py: legalize_min_disturbance() nudges minimum distance (experimental; row-based still default, more reliable)
- ashvin/view.py: edge visualization shows long edges in red, WL in title
- ashvin/run_tests.py: --no-wl-polish flag for faster benchmarking
- ashvin/tune.py: lr_schedule as tunable param
- ashvin/solver.py: lr_schedule support (warmup, warmup_cosine, constant)

Min-disturbance was worse on WL (0.57 vs 0.52) due to cascading nudges. Reverted to row-based as default. WL gap (0.45 vs 0.13) is fundamentally about GD quality, not legalization approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ashvin/init_placement.py: spectral (eigenvector) initial placement via graph Laplacian; places connected cells near each other
- ashvin/solver.py: solve_multistart() tries random + spectral init, picks best WL result with 0 overlap
- ashvin/wl_optimize.py: cell_swap uses spatial hash for O(1) overlap check instead of O(N); 500x faster on test 10

Multi-start WL: 0.4468 (from 0.4544). Some tests see big gains from spectral init (test 3: 0.63→0.33, test 9: 0.50→0.37), others prefer random. Multi-start picks the best per test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
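To illustrate why a spatial hash turns the per-swap overlap check from O(N) into roughly constant time, here is a small self-contained sketch. The class and method names are hypothetical and are not taken from ashvin/wl_optimize.py; cells are assumed to be (cx, cy, w, h) rectangles, and touching edges are not counted as overlap.

```python
from collections import defaultdict

def rects_overlap(a, b):
    """Strict overlap test for (cx, cy, w, h) rectangles."""
    return (abs(a[0] - b[0]) < (a[2] + b[2]) / 2 and
            abs(a[1] - b[1]) < (a[3] + b[3]) / 2)

class SpatialHash:
    def __init__(self, bin_size):
        self.bin_size = bin_size
        self.bins = defaultdict(set)   # (bx, by) -> ids of cells touching that bin
        self.rects = {}                # cell id -> rectangle

    def _bins_for(self, rect):
        cx, cy, w, h = rect
        x0, x1 = int((cx - w / 2) // self.bin_size), int((cx + w / 2) // self.bin_size)
        y0, y1 = int((cy - h / 2) // self.bin_size), int((cy + h / 2) // self.bin_size)
        return [(bx, by) for bx in range(x0, x1 + 1) for by in range(y0, y1 + 1)]

    def insert(self, cell_id, rect):
        self.rects[cell_id] = rect
        for key in self._bins_for(rect):
            self.bins[key].add(cell_id)

    def remove(self, cell_id):
        rect = self.rects.pop(cell_id)
        for key in self._bins_for(rect):
            self.bins[key].discard(cell_id)

    def overlaps_any(self, rect, ignore=()):
        """Only cells sharing a bin with rect are tested, not all N cells."""
        seen = set(ignore)
        for key in self._bins_for(rect):
            for other in self.bins.get(key, ()):
                if other not in seen:
                    seen.add(other)
                    if rects_overlap(rect, self.rects[other]):
                        return True
        return False
```

A swap candidate would be checked with `overlaps_any(new_rect, ignore={cell_id})`, so the cost depends on local bin occupancy rather than on the total cell count.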
100 trials, best trial 97. Key: higher lambda_wl (3.58), warmup_cosine LR, low overlap start (1.23), soft beta (2.03). WL improved 0.45 → 0.41. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Barycentric refinement: move cells toward centroid of neighbors, accept if no overlap. ~2% WL improvement, fast, always on.
- Scatter solver: explosive scatter + reconverge. Tested scatter factors 1.0-2.0; doesn't help (disrupted solutions don't find better minima).
- Disabled slow cell swap + GD polish by default.

Best WL: ~0.45 with optuna config + barycentric. Rank ~10.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New technique: identify cells with longest edges (top 20%), move them toward their connected neighbors' centroid, then re-solve with short GD. This breaks local WL minima by relocating problematic cells. Results: 0.4015 avg WL on tests 1-9, 0.0000 overlap. Best single test: 0.3361 (test 7). Approaching leaderboard rank 10 (Valouev at 0.3577). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Multi-scatter: 3 iterations of targeted scatter+reconverge. Each round identifies new high-WL cells and relocates them. WL 0.40 → 0.3842.
- Nuclear/SEMF loss: LJ and Bethe-Weizsacker inspired potentials tested. Negligible impact; redundant with existing WL loss.
- Updated PROGRESS.md with all experiments (Runs 14-21).

Leaderboard: rank ~9-10 (between Valouev 0.3577 and Del Monte 0.3427).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Region-aware pull toward macros hurt WL (0.387 vs 0.372) — disrupts GD positions. 5 scatter iterations saturate at 3 (no improvement after). Best avg WL: 0.3687. Best individual: 0.2592 (test 10), 0.3232 (test 7). Nuclear/SEMF loss experiments documented. Final position: 0.0000 overlap, 0.37 WL, rank ~9. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pipeline with fixed-point iteration: each pass does barycentric refinement, targeted scatter, short GD on WL only, then re-legalize. Tracks best WL and reverts if a pass doesn't improve. WL: 0.3695 avg (tests 1-10). Best individual: 0.2620 (test 10). 0.0000 overlap maintained throughout pipeline. Pipeline passes saturate at ~3. Barycentric O(N²) overlap check is the speed bottleneck for large tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Barycentric: momentum (0.7) accumulates velocity across passes, spatial hash for O(1) overlap check instead of O(N)
- Legalization: sort cells by macro affinity region before row packing so connected cells stay near their macro
- Pipeline passes with fixed-point iteration and best-so-far tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ashvin/net_legalize.py: initial implementation of net-aware legalization that scores candidate slots by alpha*displacement + beta*WL_delta. Not yet integrated into solver pipeline — needs planning + testing. Based on Abacus/BonnPlace literature: legalization should minimize displacement AND wirelength delta, not just pack into rows blindly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
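The slot score alpha*displacement + beta*WL_delta can be sketched as below. This is an illustration under stated assumptions, not the contents of ashvin/net_legalize.py: wirelength is taken to be half-perimeter wirelength (HPWL), displacement is Manhattan distance from a target position, and every function and parameter name is hypothetical.

```python
def hpwl(net, positions):
    """Half-perimeter wirelength of one net (a tuple of cell ids)."""
    xs = [positions[c][0] for c in net]
    ys = [positions[c][1] for c in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def slot_cost(cell, slot, target, positions, nets_of, alpha=1.0, beta=1.0):
    """alpha * displacement + beta * WL_delta for moving `cell` to `slot`."""
    before = sum(hpwl(net, positions) for net in nets_of[cell])
    trial = dict(positions)
    trial[cell] = slot
    after = sum(hpwl(net, trial) for net in nets_of[cell])
    displacement = abs(slot[0] - target[0]) + abs(slot[1] - target[1])
    return alpha * displacement + beta * (after - before)

def best_slot(cell, candidates, target, positions, nets_of, alpha=1.0, beta=1.0):
    """Pick the candidate slot with the lowest combined score."""
    return min(candidates,
               key=lambda s: slot_cost(cell, s, target, positions, nets_of,
                                       alpha, beta))
```

Raising beta relative to alpha lets a cell move further from its target when doing so shortens its nets; raising alpha keeps the legalizer closer to a pure minimum-displacement behavior.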
Pipeline: row-packing guarantees zero overlap, then net-aware legalizer tries each cell at barycentric target + gap positions, scoring by alpha*displacement + beta*WL_delta. Reverts if no improvement. Best results: 0.3613 avg WL (tests 1-10), 0.0000 overlap. Test 10: 0.2292 WL (was 0.2592). Test 7: 0.3177 (was 0.3232). Leaderboard position: ~rank 9 (between Valouev 0.3577 and Paleja 0.3311). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ashvin/detailed.py: post-legalization WL optimization via:
- Pair swap: swap same-height neighbors if WL improves
- Reinsertion: move worst-WL cells to barycentric target gaps

Capped at N<=300 due to O(N*bins) swap evaluation cost. Improved WL on small tests: test 7: 0.32→0.31, test 8: 0.34→0.32.

Best avg WL: 0.3613 (tests 1-10), 0.0000 overlap.

Pipeline: GD → net-aware legalize → repair → barycentric → scatter → [scatter+reconverge]×3 → detailed placement (swaps+reinsertion)

Leaderboard: rank ~9 (0.36 WL, between Valouev 0.36 and Paleja 0.33).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full pipeline results (tests 1-10): 0.0000 overlap, 0.3540 WL. Test 7: 0.3059, Test 10: 0.2292. Just 0.01 behind Del Monte (#9). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three structural changes targeting root cause (GD→legalization WL damage):
1. Cell inflation (8%): inflate sizes during GD so overlap penalty spreads cells further apart. Deflate before legalization → natural gaps → legalization needs minor corrections only.
2. Anchor loss: after legalization, GD refinement tethered to legal positions via lambda_anchor * ||pos - anchor||^2. Prevents cells from drifting far, so next legalization is a small correction.
3. Topology-preserving legalization: re-center compacted rows at GD centroid instead of always pushing rightward.

Also adds:
- WL-priority legalization (wl_legalize.py)
- Row reordering + cross-row reinsertion (global_swap.py)
- SA refinement (sa_refine.py): minimal impact, not in default pipeline
- Optuna v2 config (best_config_v2.json)
- Multistart with wl_priority + spectral variants

Results: avg WL 0.368 → 0.358 (2.7% improvement, all 9 tests improved). With multistart, test 3 reaches 0.324 (22% improvement via spectral init).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
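The first two changes are simple enough to write down directly. The sketch below uses plain Python lists rather than tensors, and the function names are hypothetical; it also assumes the 8% inflation is applied to both width and height, which the commit message does not state. The gradient follows from differentiating lambda_anchor * ||pos - anchor||^2.

```python
def inflate(sizes, factor=0.08):
    """Scale (w, h) sizes up for the GD phase; use the originals to deflate."""
    return [(w * (1 + factor), h * (1 + factor)) for w, h in sizes]

def anchor_loss(positions, anchors, lambda_anchor):
    """lambda_anchor * sum of squared distances to the legalized positions."""
    return lambda_anchor * sum((x - ax) ** 2 + (y - ay) ** 2
                               for (x, y), (ax, ay) in zip(positions, anchors))

def anchor_grad(positions, anchors, lambda_anchor):
    """Gradient of anchor_loss: 2 * lambda_anchor * (pos - anchor) per cell."""
    return [(2 * lambda_anchor * (x - ax), 2 * lambda_anchor * (y - ay))
            for (x, y), (ax, ay) in zip(positions, anchors)]
```

Because the gradient grows linearly with distance from the anchor, a cell can still move a little to shorten wires but is pulled back harder the further it drifts, which is what keeps the next legalization pass a small correction.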
New swap_engine.py with two move types:
- Within-row swap: exchange cell ordering, recompact (always legal)
- Cross-row reinsertion: remove from source row, insert near barycentric target in destination row

Test 1 improves 0.387→0.369 (+4.8%) from cross-row moves. Tests 2-9 mixed (slight regressions on some from within-row swaps).

Also reverts topology-preserving legalization (caused regression). Keeps cell inflation + anchor loss which give consistent improvement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Plan: interleaved legalize-GD → legalization-aware GD → constructive placement Visual analysis shows Abacus fails when GD clusters need compaction (test 3) but wins when preserving GD neighborhoods (test 2). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Interleaved legalize-GD tested and rejected: -0.2% to -4.8% on all tests. Mid-training legalization disrupts Adam momentum. The current pipeline (full GD → legalize → anchored-GD-polish × 5) is already the right structure. Saved constructive placement plan (island clustering approach): form legal blocks → promote to macro units → coarse/fine placement with LR schedule Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Row penalty (sin²πy ramped in main GD loop) helps T2 +6.1% but hurts T1 -5%, T3 -1.9%. Same pattern as all GD modifications: helps some tests, hurts others, net flat. GD→legalize architecture has ceiling ~0.35-0.36. Moving to Step 3: constructive placement (island clustering). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
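The row penalty sin²(πy) is zero when y sits on an integer and peaks halfway between, so it nudges cells toward rows during GD. A minimal sketch follows; it assumes rows lie at integer multiples of a `row_height` parameter, which is an assumption added here, as are the function names and the `weight` argument standing in for the ramped coefficient.

```python
import math

def row_penalty(ys, weight, row_height=1.0):
    """weight * sum of sin^2(pi * y / row_height); zero when y is on a row."""
    return weight * sum(math.sin(math.pi * y / row_height) ** 2 for y in ys)

def row_penalty_grad(ys, weight, row_height=1.0):
    """d/dy sin^2(pi*y/h) = (pi/h) * sin(2*pi*y/h), scaled by weight."""
    return [weight * (math.pi / row_height) * math.sin(2 * math.pi * y / row_height)
            for y in ys]
```

Note that the gradient also vanishes exactly halfway between rows, so a cell sitting at the midpoint gets no push from this term alone.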
Instead of replacing GD, use island clustering to create connectivity-aware initial positions that feed into the existing GD pipeline. Island init wins tests 1 (0.400) and 2 (0.313, best ever). Spectral wins test 3 (0.311, best ever). Greedy wins tests 4-5. Multistart picks best per test automatically. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added force-directed init (iterative neighbor averaging) and sequential placement (degree-ordered, near placed neighbors). Compared all 4 by avg WL (lower is better, listed best to worst): random 0.371, force_dir 0.391, sequential 0.400, spectral 0.428. GD is robust to random init. Connectivity-aware inits cluster too tightly, making overlap harder to resolve. Init is NOT the bottleneck. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cks it Rewrote Abacus to use barycentric WL targets instead of displacement. Raw comparison: beats greedy 7/9 tests by 1-4%. Full pipeline: regresses because pipeline passes are co-adapted to greedy. Reverted to greedy-only. Logged findings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Place cells one-by-one at WL-optimal positions (most-connected first), then swap refinement. No GD, no legalization, no overlap loss. Wins tests 1,3,4 (WL 10-12% better than GD pipeline). Loses on larger tests — swap engine needs more iterations. Residual overlap 3-40% — compaction needs work. Fast runtime (3-42s vs 30-300s for GD pipeline). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Project cells to nearest macro boundary when overlap detected. Macro overlaps reduced from 5-16 pairs to 1-6. WL beats GD pipeline on tests 1,4,9 without legalization. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Precompute macro-blocked x-intervals per row before placing any cell. All cell placements use best_legal_x() to find nearest legal position. Macro legalization runs first to separate overlapping macros. compact_row skips blocked intervals when pushing cells right. Result: 0.0000 overlap on all 9 tests. WL avg 0.413 (worse than GD pipeline 0.358) because macro legalization pushes macros too far apart. Need better macro positioning next. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
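A sketch of the blocked-interval idea: compute, per row, the x-ranges covered by macros, then clamp a cell's target into the nearest free gap. The name best_legal_x() comes from the commit message, but the signature and body here are guesses for illustration. Macros are assumed to be (x_left, y_bottom, width, height) tuples, and already-placed standard cells are ignored.

```python
def blocked_intervals(row_y, row_height, macros):
    """Merged, sorted x-intervals of macros that intersect the given row."""
    spans = sorted((x, x + w) for x, y, w, h in macros
                   if y < row_y + row_height and y + h > row_y)
    merged = []
    for lo, hi in spans:
        if merged and lo <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], hi)   # extend the previous span
        else:
            merged.append([lo, hi])
    return [tuple(m) for m in merged]

def best_legal_x(target_x, width, blocked, row_lo, row_hi):
    """Left edge nearest to target_x that fits in a free gap, or None."""
    best = None
    cursor = row_lo
    for lo, hi in blocked + [(row_hi, row_hi)]:      # sentinel closes last gap
        gap_lo, gap_hi = cursor, min(lo, row_hi)
        if gap_hi - gap_lo >= width:
            x = min(max(target_x, gap_lo), gap_hi - width)  # clamp into gap
            if best is None or abs(x - target_x) < abs(best - target_x):
                best = x
        cursor = max(cursor, hi)
    return best
```

Precomputing the intervals once per row means each placement query only walks that row's short interval list instead of testing against every macro.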
Constructive v2 now achieves 0.0000 overlap on all 9 tests. Macro gaps + blocked interval skipping in compact_row. Spatially-aware placement order (cells sorted by anchor macro position). WL avg 0.447 (worse than GD 0.358) — macro gaps push cells far from optimal. T2 beats GD (0.331 vs 0.338). Swap engine needs more iterations and better moves to recover WL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reduced macro gaps (base 0.5, per-shared 0.3)
- Added within-row swaps back with proper both-cell WL evaluation
- Track swapped pairs to prevent oscillation
- Added macro legalization before blocked interval computation

Zero overlap on 7/9 tests. T2 beats GD (0.326 vs 0.338). Avg WL 0.419 vs GD 0.358; macro placement still too spread.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tests

- Macro gaps sized by shared cell width / rows spanned
- Bidirectional compact: right sweep resolves overlaps, left sweep pulls cells back toward targets. Distributes displacement symmetrically.
- Deferred compaction: place all cells first, compact once at end
- Zero overlap on ALL 9 tests
- T2 best ever: 0.323 (beats GD pipeline 0.338)
- Avg WL 0.417 vs GD 0.358; gap from cell distribution, not overlaps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1: iterative barycentric averaging (20 iters, 0.7 damping).
  Places all cells at connectivity-optimal positions (overlapping).
Phase 2: snap to rows, spread via bidirectional compaction.
  Minimal displacement from targets.

Dropped macro gaps (were hurting, not helping). Zero overlap all 9 tests. Avg WL 0.406 vs GD 0.358.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
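Phase 1 can be pictured as repeatedly pulling each movable cell toward the centroid of its neighbors while macros stay fixed. In the sketch below the function name and data layout are hypothetical, and "0.7 damping" is interpreted as moving 70% of the way to the centroid per iteration; the commit message does not say which convention is used, so the opposite reading (keeping 70% of the old position) is equally plausible.

```python
def barycentric_average(positions, neighbors, fixed, iters=20, damping=0.7):
    """positions: id -> (x, y); neighbors: id -> list of ids; fixed: set of ids."""
    pos = dict(positions)
    for _ in range(iters):
        new_pos = dict(pos)                 # update all cells from the same snapshot
        for cell, nbrs in neighbors.items():
            if cell in fixed or not nbrs:
                continue
            cx = sum(pos[n][0] for n in nbrs) / len(nbrs)
            cy = sum(pos[n][1] for n in nbrs) / len(nbrs)
            x, y = pos[cell]
            # Move a `damping` fraction of the way to the neighbor centroid.
            new_pos[cell] = (x + damping * (cx - x), y + damping * (cy - y))
        pos = new_pos
    return pos
```

The result deliberately ignores overlap: connected cells collapse toward shared centroids, which is why Phase 2 has to spread them back out into rows.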
BFS places cells in connectivity order from macros but has ordering dependency. Iterative averaging optimizes all connections simultaneously. Reverted to averaging. Logged BFS results. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mple Plots confirm all 3 phase 1 variants (avg, BFS, BFS+avg) produce identical cell clusters. The averaging dominates. Phase 1 is not the bottleneck — phase 2 (spreading overlapping cluster into rows) is where WL is lost. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When cells overlap, try moving less-connected cell to adjacent row first. Only push right if no room in adjacent rows. Distributes cells across more rows, reducing cascade pushes. Improves 6/9 tests. T5 +0.010, T7 +0.013. Zero overlap on 8/9. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GD for phase 1 (optimal overlapping positions) + constructive phase 2 (WL-aware spreading). Zero overlap all tests. T2=0.315 is best ever on any test. But avg 0.402 vs GD pipeline 0.358 — spreading adds 0.15 WL vs legalization's 0.11. The constructive spreading is worse than greedy legalization at preserving WL. Need fundamentally better spreading. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stripped cross-row moves from spreading. Within-row compact gives identical results (0.406 constr, 0.403 hybrid). Confirms y-assignment is the bottleneck — cells cluster in too few rows after averaging. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move cells from overloaded rows to adjacent ones to reduce x-compaction. Result: avg unchanged (0.406 constr, 0.403 hybrid). The y-displacement from cross-row moves cancels the x-compaction benefit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Distribute cells across rows with capacity limits instead of all going to nearest row. Soft repulsion tested (zero effect — most nearby cells ARE neighbors). Load balancing helps T1/T3/T6/T9 but hurts T5/T8. Spreading improvements are at diminishing returns. The 0.405 avg with zero overlap is the constructive baseline. Next: swap engine improvements or completely different spreading approach. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace defaultdict + nested Python loops with sort-based binning, torch.unique_consecutive for bin boundaries, and vectorized meshgrid for within-bin pair generation. Correctness verified on T4/T9. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
T5: 52s->26s (2x), T6: 126s->51s (2.5x), T9: 587s->341s (1.7x). WL unchanged, zero overlap maintained. Torch sort-based binning replaces Python defaultdict loops in pair generation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Brute-force O(N^2) pairwise distance check using torch broadcasting for std cells. No Python loops at all for N<=2000. Sweepline fallback for larger N. T4: 32s->28s. Correctness verified. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds Ashvin Verma to both the updated and old leaderboard tables.
New leaderboard result:
Old leaderboard comparable result:
Note: Selective projected GD + WL-aware legalizer
Best result CSV: ashvin/results/20260421_124936_selective_projected_shelf_v2_full_suite.csv