
Wrong output from getarch on Apple M4 #5227

Merged: martin-frbg merged 3 commits into OpenMathLib:develop from zanpeeters:develop on Apr 21, 2025

@zanpeeters

When I run getarch 1 on an Apple M4, it reports different values than sysctl hw does when run from the command line.

Steps

  1. git clone
  2. cd OpenBLAS
  3. mkdir build
  4. cmake -B build
  5. ./build/getarch 1

output:

#define ARMV8
#define HAVE_NEON
#define HAVE_VFPV4
#define VORTEX
#define L1_CODE_SIZE	     1867590060
#define L1_CODE_LINESIZE     1867590060
#define L1_DATA_SIZE	     1867590060
#define L2_SIZE	     1867590060
#define DTB_DEFAULT_ENTRIES  64
#define DTB_SIZE             4096
#define NUM_CORES 10
#define NUM_CORES_LP 2
#define NUM_CORES_HP 2
#define CHAR_CORENAME "VORTEX"

When I run sysctl hw I get:

hw.ncpu: 10
hw.nperflevels: 2
hw.perflevel0.cpusperl2: 4
hw.perflevel1.cpusperl2: 6
hw.cpufamily: 1867590060
hw.physicalcpu_max: 10
hw.l1icachesize: 131072
hw.cachelinesize: 128
hw.l1dcachesize: 65536
hw.l2cachesize: 4194304

According to man sysctlbyname:

sysctlbyname(const char *name, void *oldp, size_t *oldlenp, void *newp, size_t newlen)

The information is copied into the buffer specified by oldp. The size of the buffer is given by the location specified by oldlenp before the call, and that location gives the amount of data copied after a successful call and after a call that returns with the error code ENOMEM.

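In other words, every call to sysctlbyname may overwrite length64 in cpuid_arm64.c with the number of bytes it copied, so the variable no longer necessarily holds the buffer size when the next call is made.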

Changes

  1. Set length64 = sizeof(value64) before every call to sysctlbyname.
  2. Added L1_DATA_LINESIZE (needed by the benchmarks).
  3. Changed the key hw.perflevel0.cpusperl to hw.perflevel0.cpusperl2, which is the name reported at least on my Apple M4.
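
The first change can be illustrated with a small portable sketch. It does not call the real sysctlbyname. Instead, fake_sysctlbyname is a hypothetical stand-in that mimics only the in/out behaviour of the length argument quoted from the man page. The value widths (4 bytes for hw.cpufamily, 8 bytes for hw.l1icachesize) and the choice to copy nothing when the buffer is too small are assumptions, picked because they are consistent with the getarch output above. The function names are illustrative and are not taken from cpuid_arm64.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for sysctlbyname(): models only the dual use of
   the length argument (buffer size on entry, bytes copied on return). */
int fake_sysctlbyname(const char *name, void *oldp, size_t *oldlenp) {
    /* Values from the sysctl hw listing above; the widths are assumptions. */
    int32_t cpufamily = 1867590060;
    int64_t l1icachesize = 131072;
    const void *src;
    size_t need;

    if (strcmp(name, "hw.cpufamily") == 0) {
        src = &cpufamily;
        need = sizeof(cpufamily);
    } else if (strcmp(name, "hw.l1icachesize") == 0) {
        src = &l1icachesize;
        need = sizeof(l1icachesize);
    } else {
        errno = ENOENT;
        return -1;
    }
    if (*oldlenp < need) {
        /* Simplification: this mock copies nothing when the buffer is too small. */
        errno = ENOMEM;
        return -1;
    }
    memcpy(oldp, src, need);
    *oldlenp = need; /* the length now reports the bytes copied */
    return 0;
}

/* Pattern before the fix: the length is set once and then reused. */
int64_t read_l1i_reusing_length(void) {
    int64_t value64 = 0;
    size_t length64 = sizeof(value64);

    fake_sysctlbyname("hw.cpufamily", &value64, &length64);
    assert(length64 == sizeof(int32_t)); /* shrunk by the first call */

    /* Fails with ENOMEM because length64 is now 4; value64 is left unchanged. */
    fake_sysctlbyname("hw.l1icachesize", &value64, &length64);
    return value64; /* still holds the cpufamily bytes */
}

/* Pattern after the fix: the length is reset before every call. */
int64_t read_l1i_resetting_length(void) {
    int64_t value64 = 0;
    size_t length64 = sizeof(value64);

    fake_sysctlbyname("hw.cpufamily", &value64, &length64);

    length64 = sizeof(value64); /* restore the full buffer size */
    if (fake_sysctlbyname("hw.l1icachesize", &value64, &length64) != 0)
        return -1;
    return value64;
}
```

Calling read_l1i_resetting_length() yields the cache size, while read_l1i_reusing_length() returns whatever the earlier hw.cpufamily query left in the buffer. Checking the return value of every call, as the second function does, would also have surfaced the failure instead of silently reporting a stale value.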

Result

#define ARMV8
#define HAVE_NEON
#define HAVE_VFPV4
#define VORTEX
#define L1_CODE_SIZE	     131072
#define L1_CODE_LINESIZE     128
#define L1_DATA_LINESIZE     128
#define L1_DATA_SIZE	     65536
#define L2_SIZE	     4194304
#define DTB_DEFAULT_ENTRIES  64
#define DTB_SIZE             4096
#define NUM_CORES 10
#define NUM_CORES_LP 6
#define NUM_CORES_HP 4
#define CHAR_CORENAME "VORTEX"

@martin-frbg added this to the 0.3.30 milestone on Apr 21, 2025
@martin-frbg
Collaborator

Thank you - I appear to have missed that part about the dual use of the length parameter. Luckily this blunder should not have had serious impact on the current ARM64 code (apart from preventing the benchmarks from being built on M4)

@martin-frbg merged commit f5bc97c into OpenMathLib:develop on Apr 21, 2025
84 of 86 checks passed