Crash Simulation with Arm and the Catalyst UK Project

*Written in collaboration with HPE, Arm, and Marvell

Industry demand for increasingly accurate automotive simulations is on the rise, and the Armv8 architecture is emerging as a viable alternative to legacy instruction set architectures, such as x86, in the HPC, datacenter, and cloud computing markets. Arm-based CPUs can handle compute-intensive applications and be an attractive choice for energy efficiency and hardware cost optimization.

To better understand the potential of the Arm processor architecture in the high-performance computing (HPC) segment, Altair initiated a collaboration with Arm, Marvell, and HPE to evaluate the performance of Arm-based processors when running Altair HyperWorks™ solvers. The plan was to start by running crash code benchmarks to assess performance on a cluster of HPE Apollo 70 servers equipped with Marvell® ThunderX2® processors.

Crash simulation is one of the most demanding types of numerical applications, so the Altair Radioss™ structural analysis solver — used to simulate highly non-linear phenomena like car crashes, with additional applications in areas including consumer goods, electronics, and defense — was selected to be the first Altair solver evaluated on Arm.

Simulating Car Crashes with Radioss

Applications like Radioss allow auto manufacturers to model and simulate crash impacts on a supercomputer, using virtual vehicles, instead of creating and destroying costly physical prototypes. Simulation saves automotive companies time and money, and it allows manufacturers to look at many more design options, leading to broader choice and better-optimized results.

Radioss was one of the first commercial applications to be ported to Arm, with Arm and Altair engineers collaborating to make it happen. It’s a highly optimized code that benefits from the latest processor improvements, including large numbers of cores and increased memory bandwidth. Its advanced parallelization scheme makes it possible to decrease time-to-solution by increasing the number of nodes used for a single Radioss simulation job. This parallel performance depends heavily on the code's ability to use a high-speed interconnect network, together with an efficient implementation of the MPI library.

Benchmarking and the Catalyst UK Program

Altair Radioss was benchmarked on a 64-node Bristol cluster as part of the Catalyst UK Program — a program established to accelerate adoption of Arm-based supercomputer applications in the UK, support research into future architectures and software, and expand the Arm software ecosystem for HPC.

Catalyst UK Cluster Overview

The Catalyst UK project is a collaboration between HPE, Arm, SUSE, Marvell, Mellanox, and the University of Bristol, University of Leicester, and University of Edinburgh (EPCC). It includes the development and use of one of the largest Arm-based supercomputing installations worldwide. The project was awarded a 2019 HPCwire Readers’ Choice Award for Best HPC Collaboration between Academia, Government and Industry. The Catalyst UK program is slated to continue its research and collaboration through 2021.

Scalability Results

Benchmarking on the project’s HPE Apollo 70 cluster showed that Radioss running on Arm-based Marvell ThunderX2 processors scales very well. The team achieved high levels of efficiency by testing different MPI libraries and setups, running several industrial examples on a multi-node cluster. As illustrated in the figure below, on the left, strong scaling for a 10-million-element car crash simulation is studied from 1 node (64 cores) to 48 nodes (3,072 cores). On the right, the scalability curve obtained on Arm is compared with a reference curve, showing comparable or even better efficiency than the reference.

Consequently, we can expect good price/performance with Radioss running on Arm. Customers can expect accelerated simulation conditions that contribute to creating designs more quickly and efficiently. These enhanced conditions directly contribute to improving productivity and reliability, plus accelerated time to market in commercial environments.

Radioss Scaling on an Apollo 70 cluster
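
For readers who want to reproduce this kind of strong-scaling analysis on their own runs, the usual metrics are speedup (baseline elapsed time divided by elapsed time on more nodes) and parallel efficiency (speedup divided by the ideal speedup). The Python sketch below is a minimal illustration, not the procedure used in the benchmark. The function name `strong_scaling` is hypothetical, the 64 cores per node matches the cluster described above, and the timings are placeholder values, not measured Radioss results.

```python
def strong_scaling(nodes, elapsed_s, cores_per_node=64):
    """Compute strong-scaling metrics relative to the first (baseline) run.

    nodes      -- node counts, baseline first (e.g. [1, 2, 4])
    elapsed_s  -- elapsed wall-clock time in seconds for each run
    Returns a list of (cores, speedup, efficiency) tuples.
    """
    base_nodes, base_time = nodes[0], elapsed_s[0]
    rows = []
    for n, t in zip(nodes, elapsed_s):
        speedup = base_time / t          # how much faster than the baseline
        ideal = n / base_nodes           # perfect (linear) scaling
        rows.append((n * cores_per_node, speedup, speedup / ideal))
    return rows


# Placeholder timings for illustration only (NOT measured results).
example_nodes = [1, 2, 4]
example_elapsed_s = [1000.0, 520.0, 280.0]

for cores, speedup, efficiency in strong_scaling(example_nodes, example_elapsed_s):
    print(f"{cores:5d} cores  speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

An efficiency close to 100% means that adding nodes reduces time-to-solution almost proportionally, which is the behavior a scalability curve is meant to show.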

The Future with the Arm Ecosystem

With porting completed, and validation on the ThunderX2-based platform demonstrating high scalability and efficiency, Altair Radioss is ready for next-generation Arm hardware. This includes processors like future Marvell ThunderX® devices for turnkey implementation, and the Fujitsu A64FX chip, announced at SC19, which will power future Fujitsu supercomputers as well as future Cray systems.

Altair continues to work with our partners to further assess performance on multi-node clusters. Radioss is currently available on the Arm architecture as a proof of concept (POC) for customers interested in trying it and working collaboratively with Altair and the Arm community.