

# FLAGSHIP 2020 project: Development of "post-K" and ARM SVE

Mitsuhisa Sato, Team Leader of Architecture Development Team

FLAGSHIP 2020 project
RIKEN Advanced Institute for Computational Science (AICS)

IWOMP2016, 6th October, 2016



# An Overview of Flagship 2020 project


- Developing the next Japanese flagship computer, temporarily called "post K"
- Developing a wide range of application codes to run on the "post K" and solve major social and scientific issues



The Japanese government selected 9 social & scientific priority issues and their R&D organizations.







### Co-design

I/O network











# **Target Applications' Characteristics**



| No. | Program     | Brief description                                                                           | Co-design targets                                         |
|-----|-------------|---------------------------------------------------------------------------------------------|-----------------------------------------------------------|
| 1   | GENESIS     | MD for proteins                                                                             | Collective comm. (all-to-all), floating point perf. (FPP) |
| 2   | Genomon     | Genome processing (genome alignment)                                                        | File I/O, integer perf.                                   |
| 3   | GAMERA      | Earthquake simulator (FEM in unstructured & structured grid)                                | Comm., memory bandwidth                                   |
| 4   | NICAM+LETKF | Weather prediction system using big data (structured grid stencil & ensemble Kalman filter) | Comm., memory bandwidth, file I/O, SIMD                   |
| 5   | NTChem      | Molecular electronic structure calculation                                                  | Collective comm. (all-to-all, allreduce), FPP, SIMD       |
| 6   | FFB         | Large Eddy Simulation (unstructured grid)                                                   | Comm., memory bandwidth                                   |
| 7   | RSDFT       | An ab-initio program (density functional theory)                                            | Collective comm. (bcast), FPP                             |
| 8   | Adventure   | Computational mechanics system for large-scale analysis and design (unstructured grid)      | Comm., memory bandwidth, SIMD                             |
| 9   | CCS-QCD     | Lattice QCD simulation (structured grid Monte Carlo)                                        | Comm., memory bandwidth, collective comm. (allreduce)     |



## Co-design





**Architectural Parameters** 

- #SIMD, SIMD length, #core
- cache (size and bandwidth)
- memory technologies
- specialized hardware
- Interconnect
- I/O network

- Mutual understanding between computer architecture/system software and applications
- Looking at performance predictions
- Finding the best solution under constraints, e.g., power consumption, budget, and space
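As a rough illustration of combining performance predictions with a constraint, the sketch below uses a simple roofline estimate (attainable rate = min(peak, arithmetic intensity × memory bandwidth)) and picks the fastest candidate that fits a power budget. The roofline model is swapped in here purely for illustration; the slides do not say which prediction method the project used. The `design_point` type, both function names, and any numbers passed in are hypothetical and do not describe the actual post-K design.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node design point; the fields are illustrative assumptions,
   not post-K parameters. */
typedef struct {
    double peak_gflops;  /* peak floating-point rate, GFLOP/s */
    double mem_bw_gbs;   /* sustained memory bandwidth, GB/s */
    double power_watts;  /* estimated node power, W */
} design_point;

/* Roofline estimate: attainable GFLOP/s = min(peak, intensity * bandwidth),
   where intensity is FLOP per byte of memory traffic. */
double roofline_gflops(const design_point *d, double flops_per_byte)
{
    assert(d != NULL && flops_per_byte >= 0.0);
    double bw_bound = flops_per_byte * d->mem_bw_gbs;
    return bw_bound < d->peak_gflops ? bw_bound : d->peak_gflops;
}

/* Return the index of the fastest candidate within the power budget,
   or -1 if no candidate fits. */
int best_design(const design_point *cands, size_t n,
                double flops_per_byte, double power_budget_watts)
{
    int best = -1;
    double best_perf = -1.0;
    for (size_t i = 0; i < n; ++i) {
        if (cands[i].power_watts > power_budget_watts)
            continue;  /* violates the constraint */
        double perf = roofline_gflops(&cands[i], flops_per_byte);
        if (perf > best_perf) {
            best_perf = perf;
            best = (int)i;
        }
    }
    return best;
}
```

With a low arithmetic intensity (a memory-bound stencil, say) this selection tends to favor bandwidth-rich candidates, while a high intensity favors peak floating-point rate, which is one way to see why the target applications pull the design in different directions.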





# **R&D Organization**





#### **Communities**

- HPCI Consortium
- PC Cluster Consortium
- OpenHPC

- ...



- Univ. of Tsukuba
- Univ. of Tokyo
- Univ. of Kyoto

# International Collaboration

- DOE-MEXT
- JLESC
- ...




# An Overview of post K



#### Hardware

- Manycore architecture
- 6D mesh/torus Interconnect
- 3-level hierarchical storage system
  - Silicon Disk
  - Magnetic Disk
  - Storage for archive



#### System Software

- Multi-kernel: Linux with lightweight kernel
  - McKernel: a lightweight kernel for manycore
- File I/O middleware for 3-level hierarchical storage system and application
- Application-oriented file I/O middleware
- MPI+OpenMP programming environment
- Highly productive programming languages and libraries
  - XcalableMP PGAS language
  - FDPS DSL
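For readers less familiar with the MPI+OpenMP model listed above, here is a minimal sketch of the intra-node (OpenMP) half only: a dot product whose loop is shared among threads with a reduction. The function name `dot` is an assumption for illustration, and the inter-node MPI part (for example, combining per-rank partial sums with an allreduce) is deliberately left out so the block stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Thread-parallel dot product. If the compiler is run without OpenMP
   support, the pragma is ignored and the loop simply runs serially. */
double dot(const double *x, const double *y, long n)
{
    assert(n >= 0);
    double sum = 0.0;
    /* Each thread accumulates a private partial sum; OpenMP combines them. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

In a hybrid code, each MPI rank would typically call a routine like this on its local portion of the data and then combine the per-rank results across nodes.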



#### What we have done



#### Hardware

- Instruction set architecture



#### Continue to design



- Node architecture
- System configuration
- Storage system

#### Software

- OS functional design (done)
- Communication functional design (done)
- File I/O functional design (done)
- Programming languages (under development)
- Mathematical libraries (under development)



#### **Instruction Set Architecture**



#### ARM v8 with HPC Extension SVE

- Fujitsu is a lead partner of ARM HPC extension development
- Detailed features were announced at Hot Chips 28 (2016)

http://www.hotchips.org/program/ Mon 8/22, Day 1, 9:45 AM, GPUs & HPCs: "ARMv8-A Next Generation Vector Architecture for HPC", **SVE (Scalable Vector Extension)**
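The key idea behind SVE is vector-length-agnostic code: the architecture allows vector lengths from 128 to 2048 bits, and a per-lane predicate masks off the lanes that fall past the end of the data, so one loop serves every implementation without a scalar tail. The sketch below emulates that pattern in plain scalar C; it does not use real SVE instructions or intrinsics. The names `make_predicate` and `daxpy_vla` and the `lanes` parameter are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LANES 32  /* 2048-bit maximum vector / 64-bit lanes */

/* Emulates a WHILELT-style predicate: lane k is active while i + k < n. */
static void make_predicate(bool pred[], size_t lanes, size_t i, size_t n)
{
    for (size_t k = 0; k < lanes; ++k)
        pred[k] = (i + k < n);
}

/* y[i] += a * x[i], written so the same loop works for any lane count.
   On real hardware the inner loop would be one predicated vector operation. */
void daxpy_vla(size_t lanes, size_t n, double a, const double *x, double *y)
{
    assert(lanes >= 1 && lanes <= MAX_LANES);
    bool pred[MAX_LANES];
    for (size_t i = 0; i < n; i += lanes) {
        make_predicate(pred, lanes, i, n);
        for (size_t k = 0; k < lanes; ++k)
            if (pred[k])  /* inactive lanes touch no memory */
                y[i + k] += a * x[i + k];
    }
}
```

Calling `daxpy_vla` with `lanes` set to 2 (a 128-bit vector of doubles) or 8 (a 512-bit vector) gives the same result for any `n`, which is the property that lets a single binary run on implementations with different vector lengths.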

#### Fujitsu's additional support

- FMA
- Math acceleration primitives
- Inter-core hardware-supported barrier
- Sector cache
- Hardware prefetch assist





2016/11/14

#### RIKEN's Action and Research for ARM SVE



- Early assessment for ARM SVE spec.
  - Review of specification of SVE
  - Deployment of ARM-SVE DS-5 tools
- GEM5 processor simulator for ARM SVE
  - Deployment and testing of GEM5 Atomic Model developed in ARM
  - Development of GEM5 O3 Model for "Post-K" processor (ongoing)
    - Adjustment of parameters and performance against Fujitsu's in-house processor simulator
- Evaluation and Testing of compilers for ARM SVE (with U. Kyoto)
  - ARM compiler for SVE (based on LLVM) (C and C++)
  - Fujitsu compiler for SVE (Fortran and C, C++)
  - These compilers are still immature; we provide feedback based on inspection of the generated code
  - Performance evaluation with GEM5 O3 (planned)
- Compiler Research on SIMD-vector code-generation for ARM SVE



#### **Concluding Remarks**



- We are very excited and believe that ARM SVE will deliver high-performance and flexible SIMD vectorization to our "post-K" manycore processor.
- We think establishing an "eco-system" for ARM SVE in the high-end HPC area is very important, and we are willing to collaborate with partners who are interested in ARM SVE, as well as with ARM.
  - Extend "mobile" eco-system of ARM to HPC!!!
- The system software for Post K is being designed and implemented by leveraging international collaborations
  - The system software developed at RIKEN is open source
  - RIKEN will contribute to OpenHPC project



