Written in collaboration with cloud partner, Microsoft Azure.
Crash simulations save lives and enable auto manufacturers to avoid expensive physical crash testing. The popular Altair Radioss™ solver performs structural analysis for highly non-linear, compute-intensive problems including crashworthiness, airbag performance, and multiphysics with complex variable sets. Vehicle makers use Radioss to fine-tune their designs, and they rely on fast performance to create more designs in less time and get products to market faster.
Crash Simulation in the Cloud
As demand for high-performance computing (HPC) increases, cloud has become a popular alternative to in-house computing resources. Altair recently collaborated with Microsoft to achieve excellent performance results with Radioss in Microsoft Azure. Radioss scaled up to 64 nodes in Azure without degraded performance — results comparable to those achieved with an on-premises system.
Cloud is an attractive option for customers with limited in-house resources. With cloud computing resources like Azure, there’s theoretically no limit in terms of application scalability and potential job size. Users with time constraints can access more nodes in the cloud to get faster results, and organizations with on-premises data centers can burst to the cloud to run more models and get work done more quickly.
Radioss in Azure provides users with access to the latest Intel processors and high-speed InfiniBand™ interconnect for large jobs and time-critical workloads, so companies can experiment without making large upfront capital commitments. Users can run jobs that require up to 13,200 cores on the Azure HC series. These virtual machines (VMs) feature powerful Intel® Xeon® Platinum processors with a 100-gigabit InfiniBand interconnect for bare-metal performance rivaling some of the most powerful supercomputers.
The benchmark ran on Azure HC44rs instances, which are optimized for HPC. Each instance provided Intel Xeon Platinum 8168 processors with 44 cores at 2.7 GHz, plus 8 GB of memory per core (352 GB per instance). The system also included fast local storage (700 GB SSD).
To get the best performance from the Radioss setup, we used local SSD storage to avoid I/O bottlenecks, and we used hybrid MPI and OpenMP parallelization to maintain good scaling. Pure MPI offers good performance up to 16 instances, but hybrid proved more effective at 32 and 64 instances, where there are fewer than 10,000 elements per core.
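To illustrate the hybrid idea, the sketch below lists the ways a 44-core instance can be split into MPI ranks and OpenMP threads, and shows how many elements each MPI domain would hold for the 10-million-element model used in this benchmark on 64 instances. It is a minimal sketch, not the Radioss launch syntax: the function names are hypothetical, and the article does not state which rank and thread counts were actually used.

```python
CORES_PER_INSTANCE = 44        # Azure HC44rs, as quoted in the article
TOTAL_ELEMENTS = 10_000_000    # benchmark model size quoted in the article


def hybrid_layouts(cores_per_instance: int) -> list[tuple[int, int]]:
    """Return (mpi_ranks, omp_threads) pairs that use every core exactly once."""
    return [
        (cores_per_instance // threads, threads)
        for threads in range(1, cores_per_instance + 1)
        if cores_per_instance % threads == 0
    ]


def elements_per_domain(total_elements: int, instances: int, ranks_per_instance: int) -> float:
    """Average number of elements in each MPI domain (one domain per rank)."""
    return total_elements / (instances * ranks_per_instance)


layouts = hybrid_layouts(CORES_PER_INSTANCE)
for ranks, threads in layouts:
    size = elements_per_domain(TOTAL_ELEMENTS, 64, ranks)
    print(f"{ranks:2d} ranks x {threads:2d} threads per instance: "
          f"{size:,.0f} elements per MPI domain on 64 instances")
```

With one thread per rank (pure MPI) each domain is small at high instance counts. Giving each rank several OpenMP threads keeps the same number of cores busy with fewer, larger domains, which is the general motivation for hybrid parallelization.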
Radioss is a hybrid parallel code that combines MPI domain decomposition, OpenMP parallelization, and AVX-512 vectorization. This design allows high efficiency on large HPC clusters and lets users tune the MPI and OpenMP setup. Radioss is a robust solver, offering precise repeatability in parallel, plus double precision (default) or extended single precision.
The benchmark we ran was a Ford Taurus frontal impact simulation at 50 kilometers per hour, using a refined model with 10 million elements and a 2.5 mm mesh size. The full run simulation time was 120 ms (2 ms for scalability tests).
The “Performance Test” graph below shows scalability test results from one instance (44 cores) to 64 instances (2,816 cores), performed for a reduced simulation time of 2 ms. The elapsed time required to complete the 2 ms simulation decreased from 3,535 s on a single instance to 163 s on 64 instances, a reduction of nearly 22X.
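The two timings above can be turned into speedup and parallel efficiency with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in this article; the function names are illustrative, and efficiency is defined here as speedup divided by the number of instances, which is one common convention and an assumption on our part.

```python
def speedup(t_base: float, t_n: float) -> float:
    """Baseline elapsed time divided by elapsed time on n instances."""
    return t_base / t_n


def efficiency(t_base: float, t_n: float, n_instances: int) -> float:
    """Speedup relative to ideal linear scaling over n instances."""
    return speedup(t_base, t_n) / n_instances


T_1 = 3535.0    # elapsed seconds on 1 instance (44 cores)
T_64 = 163.0    # elapsed seconds on 64 instances (2,816 cores)

s = speedup(T_1, T_64)
e = efficiency(T_1, T_64, 64)
print(f"speedup: {s:.1f}x, parallel efficiency: {e:.0%}")
```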
For this model size, speedup remains quasi-linear up to 16 instances (704 cores). At 32 and 64 instances, efficiency decreases because the number of elements each CPU core computes drops below 10,000, roughly the level needed to maintain enough parallelism on high-end CPUs. This is where hybrid parallelization, based on MPI and OpenMP, helps to further decrease time-to-solution.
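The 10,000-elements-per-core threshold can be checked against the benchmark figures directly. The sketch below is illustrative only: it assumes the 10 million elements are spread evenly across all cores, and the names are our own.

```python
TOTAL_ELEMENTS = 10_000_000   # model size quoted in the article
CORES_PER_INSTANCE = 44       # Azure HC44rs
THRESHOLD = 10_000            # elements per core cited in the article


def elements_per_core(instances: int) -> float:
    """Average elements per core, assuming an even split across all cores."""
    return TOTAL_ELEMENTS / (instances * CORES_PER_INSTANCE)


# Instance counts mentioned in the article
for n in (1, 16, 32, 64):
    per_core = elements_per_core(n)
    status = "above" if per_core >= THRESHOLD else "below"
    print(f"{n:2d} instances ({n * CORES_PER_INSTANCE:4d} cores): "
          f"{per_core:,.0f} elements per core ({status} threshold)")
```

Under this even-split assumption, 16 instances still leave about 14,000 elements per core, while 32 and 64 instances fall to roughly 7,100 and 3,600, which matches where the scaling curve starts to flatten.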
The “Speedup” graph below compares the scalability curve on Azure with reference results obtained on a best-in-class supercomputer. There is almost no difference, meaning the level of scalability achieved on Azure is as good as the best bare-metal reference.
“Microsoft is proud to partner with Altair to deliver Radioss simulations on Microsoft Azure. Testing on the Azure HC-series Virtual Machines shows Azure delivers performance and scaling efficiency that is highly competitive with leading on-premises HPC environments. This reflects Azure’s ongoing commitment to delivering the very best HPC experience to users of Altair products and the HPC community at-large,” said Evan Burness, principal program manager, Azure HPC.
The Future of Agile Computing
With Radioss running efficiently on Azure — and scalability at increasingly high node counts comparable to that of a bare-metal cluster — users can make the most of cloud technology and get results as quickly as they can with on-premises resources.
Azure provides performance and flexibility for agile computing, and users can reduce deployment complexity with the Altair PBS Works™ suite, including remote visualization tools. The superior scalability we found with Radioss drastically decreases time-to-solution and speeds up product innovation for teams running large-scale crash simulation analysis and design optimization in Azure. It also allows businesses to shift budget expenditure from capital expenses (CapEx) to operating expenses (OpEx).
With performance and scalability in the cloud, going from idea to decision to execution is faster than would otherwise be possible for many on-premises sites. With Microsoft’s recent announcement of Azure HBv2-series virtual machines, capable of up to four double-precision and eight single-precision teraflops, we expect to see even more impressive application performance in the cloud.
Radioss is available as a part of the solver stack in the Altair HyperWorks™ Unlimited Virtual Appliance on Azure. To register for a trial of this solver on the appliance, click here.