
Regression: Change rand() to mt19937() to have same test results on Linux and MacOS #9578

Open
fredowski wants to merge 3 commits into The-OpenROAD-Project:master from fredowski:randfix

Conversation

@fredowski
Contributor

On Linux, rand() is linked from glibc; on MacOS it comes from a different library (Apple's libc). The two implementations produce different pseudorandom sequences, so the regression results differ between Linux and MacOS. Because the "golden" reference files are generated on Linux, the regression fails on MacOS.

This patch changes the random number generator in gpl from rand() to mt19937.

I updated the reference files accordingly. I do not see how this could affect real functionality. It reduces the number of failed tests for src/... from 60 to 14 on MacOS.
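As a minimal sketch of why this helps (not the code from this patch; the helper names `nthMt19937Output` and `portableRandBelow` are hypothetical): the C++ standard fixes the output sequence of `std::mt19937` for a given seed, and it even specifies that the 10000th output of a default-constructed engine is 4123659995. `rand()` has no such guarantee. Note that `std::uniform_int_distribution` is not specified down to the algorithm, so its results may still differ between standard library implementations; the sketch therefore assumes a plain modulo on the raw engine output, mirroring the common `rand() % n` pattern.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: returns the n-th output (1-based) of a std::mt19937
// seeded with `seed`. The sequence is fixed by the C++ standard, so the
// result is the same on glibc, Apple libc, and any other conforming platform.
std::uint32_t nthMt19937Output(std::uint32_t seed, unsigned n)
{
  assert(n >= 1);
  std::mt19937 gen(seed);
  gen.discard(n - 1);  // skip the first n-1 values
  return static_cast<std::uint32_t>(gen());
}

// Hypothetical stand-in for `rand() % bound`. It uses the raw engine output
// instead of std::uniform_int_distribution, whose algorithm is
// implementation-defined and therefore not portable across platforms.
// The modulo introduces a small bias, just as `rand() % bound` does.
std::uint32_t portableRandBelow(std::mt19937& gen, std::uint32_t bound)
{
  assert(bound > 0);
  return static_cast<std::uint32_t>(gen()) % bound;
}
```

With a fixed seed, two engines created on different platforms walk through the same values, which is what allows one set of golden files to serve Linux and MacOS.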

rand() produces different sequences on glibc from linux
versus Apple libc. mt19937 produces the same sequence
on all platforms. This affects 26 tests in gpl and 3 tests
in rsz plus the python tests which test the same.
This reduces the regression test failures on MacOS from
60 to 11.

Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>
The new platform deterministic mt19937 random number
generator produces new reference files. This commit
contains the updated golden reference files.

Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>
    The new platform deterministic mt19937 random number
    generator produces different results. This commit
    contains the updated json files for the test/orfs tests

Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>
@gemini-code-assist
Contributor

Note

The number of changes in this pull request is too large for Gemini Code Assist to generate a review.

@github-actions
Contributor

github-actions bot commented Mar 1, 2026

clang-tidy review says "All clean, LGTM! 👍"

@maliberty
Member

There are quite a few failures to address.

@fredowski
Contributor Author

> There are quite a few failures to address.

It is very strange that

//src/pdn/test:pads_black_parrot_flipchip_connect_overpads-tcl_test FAILED in 2.4s

fails on CI Bazel while it passes locally on Debian 13 x86_64. I had already noticed, though, that this test is "flaky". It also fails on MacOS, and it passes locally on Debian 13 aarch64.

Fail on MacOS:

      NEW metal10 1660 + SHAPE STRIPE ( 895810 98000 ) ( 895810 396400 )
new ->NEW metal10 1660 + SHAPE STRIPE ( 5595810 5604000 ) ( 5595810 5902000 )
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5600830 ) ( 5902000 5600830 ) 
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5595810 ) ( 5902000 5595810 )
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5585810 ) ( 5902000 5585810 )
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5575810 ) ( 5902000 5575810 )
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5560830 ) ( 5902000 5560830 ) 
      NEW metal10 1660 + SHAPE STRIPE ( 885810 98000 ) ( 885810 396400 )

Golden Reference:

      NEW metal10 1660 + SHAPE STRIPE ( 895810 98000 ) ( 895810 396400 )
      NEW metal10 1660 + SHAPE STRIPE ( 5595810 5604000 ) ( 5595810 5902000 )
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5595810 ) ( 5902000 5595810 )
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5585810 ) ( 5902000 5585810 )
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5575810 ) ( 5902000 5575810 )
      NEW metal10 1660 + SHAPE STRIPE ( 5603680 5560830 ) ( 5902000 5560830 )
      NEW metal10 1660 + SHAPE STRIPE ( 885810 98000 ) ( 885810 396400 )

However, I did not expect a difference between Ubuntu 2204 and Debian 13, both on x86_64. It is the only failing test of the 1212 bazel tests.

But that test passes in ctest:

1333/1457 Test #1348: pdn.pads_black_parrot_flipchip_connect_overpads.tcl ...........................   Passed    0.42 sec

Then from the ctest regression:

204/1457 Test  #161: odb.replace_hier_mod1.tcl .....................................................***Failed    0.95 sec
=> Set to "Manual" in bazel until ctest difference is resolved???

1013/1457 Test  #551: gpl.incremental02.tcl .........................................................***Failed   26.33 sec

=> Same. Excluded as "Manual" in bazel

1454 - openroad.upf_test.tcl (Failed)                    IntegrationTest tcl openroad log_compare
1455 - openroad.upf_aes.tcl (Failed)                     IntegrationTest tcl openroad log_compare

=> Same. Excluded as "Manual" in bazel

So there seem to be tests that behave differently between ctest and bazel and are therefore excluded from bazel. As I only tested with bazel, I did not "see" them failing.

The pads_black_parrot_flipchip_connect_overpads test is in both bazel and ctest.

  • CI CTest Ubuntu 2204: pass
  • CI Bazel: Fail
  • Local bazel test on Debian 13 x86_64: Pass
  • Local bazel test on MacOS: Fail
  • Local bazel test on Debian 13 aarch64: Pass

Strange. Maybe this test also falls into the category of unexplained differences between ctest and bazel, and we should exclude it from bazel as well?

The other tests seem to be excluded from the regular bazel tests because of such unexplained differences between ctest and bazel.

I will take a closer look at the tests that are excluded from bazel.
