SGEMM parallel performance

Benchmark run on 64-core AWS Graviton 2 (Neoverse-N1-based) SoC.

Arm Performance Libraries demonstrates the most consistent performance across a range of problem sizes:

Figure: SGEMM (small problems), comparing libraries using 64 cores of Graviton 2 (Neoverse-N1)
Figure: SGEMM, comparing libraries using 64 cores of Graviton 2 (Neoverse-N1)
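The source does not say which metric the SGEMM charts plot. Results of this kind are commonly reported in GFLOP/s, counting 2·m·n·k floating-point operations for an m×k by k×n product. The sketch below shows that arithmetic under that assumption. The names `sgemm_flops`, `gflops`, `naive_gemm` and `time_gemm` are hypothetical, and the pure-Python `naive_gemm` only stands in for a library SGEMM call; it is not the Arm Performance Libraries API.

```python
import time


def sgemm_flops(m, n, k):
    """Conventional operation count for an (m x k) by (k x n) product:
    one multiply and one add per inner-product term."""
    return 2 * m * n * k


def gflops(m, n, k, seconds):
    """Convert a measured wall-clock time into GFLOP/s."""
    return sgemm_flops(m, n, k) / seconds / 1e9


def naive_gemm(a, b):
    """Reference matrix product on lists of lists (hypothetical stand-in
    for a tuned library routine)."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def time_gemm(m, n, k, repeats=3):
    """Time the stand-in product and report the best run in GFLOP/s."""
    a = [[1.0] * k for _ in range(m)]
    b = [[1.0] * n for _ in range(k)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        naive_gemm(a, b)
        best = min(best, time.perf_counter() - start)
    return gflops(m, n, k, best)
```

Taking the best of several repeats is one common way to reduce timing noise. A real comparison would replace `naive_gemm` with each library's SGEMM routine and sweep the problem size.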

Figure: Arm PL 20.2.1 vs FFTW 3.3.8, complex-to-complex double-precision 3D transforms on a single core of Graviton 2 (Neoverse-N1)

3D FFT serial performance

Benchmark run on 64-core AWS Graviton 2 (Neoverse-N1-based) SoC.


Sparse matrix-vector multiplication parallel performance

Benchmark run on 56-core Marvell ThunderX2-based SoC. 
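For readers unfamiliar with the operation being benchmarked, here is a minimal sketch of sparse matrix-vector multiplication, y = A·x, using the compressed sparse row (CSR) layout. The source does not say which storage format the benchmark uses, so CSR is an assumption here. The function name `csr_spmv` and the example matrix are hypothetical and are not part of the Arm Performance Libraries interface.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[p] * x[col_idx[p]]
        y.append(acc)
    return y


# Example 3x3 matrix (illustrative data only):
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]

y = csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Each row's dot product is independent of the others, so the outer loop is the natural place to split work across cores in a parallel implementation.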


Figure: Arm PL 20.2.1 libamath performance improvements over GNU libm, Elefunt benchmark on Graviton 2 (Neoverse-N1)

Math functions (libamath) performance

Benchmark run on 64-core AWS Graviton 2 (Neoverse-N1-based) SoC.

Note: All data generated on AWS Graviton 2 (Neoverse-N1-based) uses the 'm6g.16xlarge' instance type.