Fujitsu Deploys Arm Processors for Efficient HPC

As high-performance computing (HPC) evolves, Arm processors are an increasingly attractive option for HPC workloads. Based on a reduced instruction set computing (RISC) design, Arm processors are affordable, energy efficient, and low latency, delivering fast performance without a high power draw.

The Arm ecosystem, as well as Arm availability and usability for HPC, has grown over the past several years. Arm-based projects are taking place all over the world, including the Fujitsu “Fugaku” supercomputer at Japan’s RIKEN Center for Computational Science. Arm users can take advantage of a broad variety of commercial and open-source software including HPC-friendly applications, compilers, debug tools, profiling tools, Python packages, math libraries, file systems, workload managers, operating systems, and parallelism tools.

Arm hardware can be tailored to HPC workloads with additions such as the Scalable Vector Extension (SVE), an extension to the Armv8-A architecture (Armv8.2-A and later) designed with supercomputers in mind. Fujitsu’s A64FX processor is the world’s first to implement SVE.

Fujitsu A64FX Arm-based Processors

Because Fujitsu understands that both memory performance and calculation performance are important in HPC systems, the A64FX offers both. It comes with 48 calculation cores, high memory bandwidth, 512-bit wide SIMD, and theoretical peak performance of 3.3792 teraflops in double-precision floating-point calculations. This results in excellent calculation performance in real-world applications where strong performance in both areas is needed, such as HPC and artificial intelligence (AI).
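The peak figure above can be reproduced with a short calculation. The sketch below is illustrative only: the 2.2 GHz clock and the two fused multiply-add (FMA) pipelines per core are assumptions that do not appear in this article, and the function name peak_flops is hypothetical.

```python
def peak_flops(cores, clock_hz, fma_pipes, simd_bits, element_bits=64):
    """Theoretical peak floating-point operations per second for one CPU."""
    lanes = simd_bits // element_bits        # elements processed per SIMD instruction
    flops_per_cycle = fma_pipes * lanes * 2  # one FMA counts as 2 operations (multiply + add)
    return cores * clock_hz * flops_per_cycle


# 48 cores and 512-bit SIMD come from the text above.
# Assumed (not stated in the article): 2.2 GHz clock, 2 FMA pipelines per core.
a64fx_dp = peak_flops(cores=48, clock_hz=2.2e9, fma_pipes=2, simd_bits=512)
print(f"{a64fx_dp / 1e12:.4f} TFLOPS")  # prints "3.3792 TFLOPS"
```

Under these assumptions, each core completes 32 double-precision operations per cycle, which across 48 cores matches the 3.3792-teraflop figure quoted above.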

HPC systems require technology with excellent power efficiency. The A64FX combines a 7nm process, 2.5D packaging technology, and a microarchitecture designed to optimize power consumption when running applications. In real-world applications, this yields a strong performance-per-watt ratio.

Reliability is a must for large-scale parallel processing systems. The A64FX has approximately 128,400 error check circuits, allowing it to offer mainframe-class reliability, availability, and serviceability (RAS) for extensive data integrity.

Workload Management with Altair® PBS Professional®

The industry-leading PBS Professional workload manager is included with Fujitsu PRIMEHPC FX700 systems, which utilize the same technology on which the world’s fastest supercomputer, “Fugaku,” is based. Altair is collaborating with Japan’s RIKEN HPC leadership center, where Fugaku is located, to research cloud utilization using Altair PBS Works™ and Altair Radioss™ on the Fugaku Cloud Platform (FCP).

PBS Professional is a fast, powerful workload manager designed to improve productivity, optimize utilization and efficiency, and simplify administration for clusters, clouds, and supercomputers. It automates job scheduling, management, monitoring, and reporting for even the biggest and most complex HPC workloads.
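As a rough illustration of what job scheduling looks like from the user side, the sketch below builds a minimal PBS Professional job script as text. The #PBS directives and the PBS_O_WORKDIR variable are standard PBS Professional features, but the helper name render_pbs_script, the job name, the resource sizes, and the solver command line are hypothetical placeholders rather than anything specified in this article.

```python
def render_pbs_script(job_name, nodes, cores_per_node, walltime, command):
    """Return the text of a minimal PBS Professional job script."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        # Request `nodes` chunks, each with all cores and one MPI rank per core.
        f"#PBS -l select={nodes}:ncpus={cores_per_node}:mpiprocs={cores_per_node}",
        f"#PBS -l walltime={walltime}",
        # PBS starts jobs in the home directory; return to the submission directory.
        "cd $PBS_O_WORKDIR",
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: 4 nodes with 48 cores each; the command is a placeholder.
script = render_pbs_script(
    job_name="crash_sim",
    nodes=4,
    cores_per_node=48,
    walltime="02:00:00",
    command="mpirun ./solver input.dat",
)
print(script)
```

A script like this would typically be saved to a file and submitted with qsub, after which the workload manager queues it until the requested resources are free.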

PBS Professional supports Arm-based architectures. The collaborative research project with RIKEN on Fugaku proposes an iterative approach: porting Radioss to Arm, integrating it with PBS Professional, porting the Altair Access™ user access portal and cloud bursting capabilities, and finally creating a cloud service with Radioss and open-source tools such as the OpenFOAM solver and the ParaView visualization application.

Running Solvers on Arm

The Radioss structural analysis solver is designed for highly non-linear problems under dynamic loadings. Radioss is used across all industries worldwide to improve the crashworthiness, safety, and manufacturability of structural designs.

Radioss is one of the first commercial applications to be ported to Arm. Its highly optimized code benefits from the latest processor improvements, including the large number of cores and the increased memory bandwidth available on the Fujitsu A64FX Arm CPU. In addition, Radioss is designed to efficiently exploit the vector instructions of Arm SVE, first implemented in the A64FX.

Its advanced parallelization scheme makes it possible to decrease time-to-solution by increasing the number of nodes used for a single Radioss simulation job. This parallel performance depends heavily on a high-speed interconnect network like the one available on the FX700 and Fugaku.

Fugaku Supercomputer

The RIKEN Center for Computational Science in Kobe, Japan, is home to Fugaku, which debuted at #1 on the Top500 in June 2020 and held the top spot in November 2020. In its June 2020 debut, Fugaku delivered a High Performance Linpack (HPL) result of 415.5 petaflops, outperforming the second-place Summit supercomputer by a factor of 2.8. It’s the first Arm-powered system to achieve the #1 spot on the Top500. Significantly, it also ranked among the 10 most energy-efficient systems on the Green500, coming in at #9.

Fugaku is powered by Fujitsu A64FX processors and can achieve peak performance over 1,000 petaflops (1 exaflop) in single-precision and half-precision formats, which are commonly used in applications like AI, machine learning, and deep learning.
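The exaflop-class figure can be sanity-checked from the per-processor peak quoted earlier. The sketch below is a back-of-the-envelope estimate under stated assumptions: the node count of 158,976 and the rule that halving the element width doubles SIMD throughput are not given in this article, and the function name system_peak_pflops is hypothetical.

```python
NODE_DP_TFLOPS = 3.3792  # per-processor double-precision peak quoted in this article
NODES = 158_976          # assumed node count (one A64FX per node); not stated in the article


def system_peak_pflops(element_bits):
    """Estimated system-wide peak in petaflops for a given element width in bits."""
    # Assumption: a 512-bit SIMD unit handles 2x as many 32-bit elements
    # and 4x as many 16-bit elements per instruction as 64-bit elements.
    speedup = 64 // element_bits
    return NODE_DP_TFLOPS * NODES * speedup / 1000  # teraflops -> petaflops


for name, bits in [("double", 64), ("single", 32), ("half", 16)]:
    print(f"{name:>6} precision: {system_peak_pflops(bits):,.0f} petaflops")
```

Under these assumptions the estimate comes to roughly 537 petaflops in double precision, 1,074 in single precision, and 2,149 in half precision, which is consistent with the claim of more than 1,000 petaflops in the reduced-precision formats.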

The Fugaku system, which is named after Mt. Fuji, is slated to be fully operational and available for science and research in the 2021 fiscal year. The large amount of CPU resources available on Fugaku will enable more accurate and complex simulations and optimization-driven designs.