Workload Orchestration is Key for Artificial Intelligence & Machine Learning

Machine learning (ML) and artificial intelligence (AI) are a large part of today’s technology landscape. There is no doubt that they are hot topics, and they will be on the agenda next week at SC19, the premier global supercomputing conference. Organizations are at various stages of applying AI and ML, but many are already making a significant impact on business, discovery, and innovation by solving real-world problems.

Altair customers are already using ML in healthcare, retail, government, financial services, and more. ML is useful for a range of tasks, including marketing analytics, fraud detection, and population growth analysis. Altair Knowledge Works™ includes solutions for data preparation, data science, predictive analytics, and streaming visualization. It merges historical and real-time data to support predictive maintenance and the creation of digital twins.

One insurance customer saved $15 million per year by using Knowledge Works to model anomalies and identify fraudulent claims. A retailer improved its marketing response and identified the ideal customers, which resulted in another $15 million win. In addition, a financial customer increased the amount of debt payments it collected by $24 million per year using ML with Knowledge Works.

In addition to the pursuit of using AI and ML to solve real problems, there are two areas of growth in today’s technology climate: using HPC solutions to improve ML and using ML to improve HPC solutions.

Using HPC Solutions to Improve Machine Learning

Training ML models is resource-intensive and requires powerful HPC. Sharp accuracy and high fidelity are mandatory. ML uses patterns to model problems and consumes large amounts of data, which leads to an enormous amount of input and output. Workload management is an essential technology for organizations as they attempt to provision and manage the HPC resources that power their ML and AI initiatives.

Thanks to advances in technologies like cloud computing, GPUs, and new ML frameworks, AI and ML are more powerful and more accessible to a broad audience than ever. Altair helps customers navigate the convergence of these new technologies.

Our customers use Altair PBS Works™ to submit ML jobs built on frameworks like TensorFlow and to optimize GPU usage, which is possible due to our strong partnership with NVIDIA. Customers also orchestrate HPC workloads across containers (with tools like Kubernetes), big data workloads (Spark), and the cloud.
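As a rough illustration of what submitting a GPU training job to a PBS-style scheduler can look like, the Python sketch below renders a small batch script. The `render_pbs_script` helper, the job name, and the `python train.py` command are hypothetical, and the sketch assumes the site has configured an `ngpus` resource; it is not taken from Altair documentation.

```python
def render_pbs_script(job_name, ncpus, ngpus, walltime, command):
    """Build the text of a PBS batch script (hypothetical helper).

    Assumes the cluster exposes an `ngpus` resource; resource names
    vary between sites, so treat the select line as a placeholder.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",                            # job name
        f"#PBS -l select=1:ncpus={ncpus}:ngpus={ngpus}",  # one chunk of CPUs and GPUs
        f"#PBS -l walltime={walltime}",                   # maximum run time
        "cd $PBS_O_WORKDIR",                              # directory the job was submitted from
        command,                                          # the training command itself
    ]
    return "\n".join(lines) + "\n"


# Example: a hypothetical TensorFlow training run on one GPU.
script = render_pbs_script("tf-train", 8, 1, "02:00:00", "python train.py")
```

The resulting text could be saved to a file and handed to `qsub`; the right resource requests depend on how a given cluster is configured.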

Using Machine Learning to Improve HPC Solutions

Not only does HPC make ML possible, but ML can also make HPC solutions more effective.

For decades, Altair’s workload management and job scheduling solution has efficiently found the right resources for the job, making it a must-have for HPC users. Using the correct ML models, we help organizations put millions of pieces of their own data to work. We look at how long, how costly, and how accurate jobs have been for a given set of resources, and use that history to train workload managers and job schedulers to assign HPC jobs to hardware more effectively.

This modeling helps determine the optimal resources to buy, including licenses, nodes, memory, and GPUs. It also compares cloud costs and answers users’ questions about job times. It prescribes the best resource for a job in terms of efficiency, time to solution, utilization, power use, and cost. It predicts execution times to improve utilization, and it predicts faults to support resilient scheduling and preemptive maintenance. It also forecasts I/O patterns to mitigate congestion at the filesystem level and on networks, and user patterns to proactively provision the correct resources at the optimal times.
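To make the idea of learning from historical job data more concrete, here is a minimal Python sketch that fits a simple runtime model per node type and then picks a resource by predicted time or cost. The job records, prices, and function names are invented for illustration, and the straight-line least-squares fit stands in for whatever models a production scheduler would actually use.

```python
from collections import defaultdict

# Hypothetical history: (node_type, input_size_gb, runtime_hours)
HISTORY = [
    ("cpu", 10, 2.0), ("cpu", 20, 4.0), ("cpu", 40, 8.0),
    ("gpu", 10, 0.5), ("gpu", 20, 1.0), ("gpu", 40, 2.0),
]
# Hypothetical hourly price per node type
HOURLY_COST = {"cpu": 1.0, "gpu": 5.0}


def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b (needs at least two distinct x values)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def train(history):
    """Fit one runtime model per node type from past jobs."""
    grouped = defaultdict(list)
    for node_type, size_gb, hours in history:
        grouped[node_type].append((size_gb, hours))
    return {node_type: fit_line(pts) for node_type, pts in grouped.items()}


def predict_hours(models, node_type, input_gb):
    """Predicted runtime in hours for a job of the given input size."""
    a, b = models[node_type]
    return a * input_gb + b


def best_resource(models, input_gb, objective="time"):
    """Pick the node type with the lowest predicted time or cost."""
    def score(node_type):
        hours = predict_hours(models, node_type, input_gb)
        return hours if objective == "time" else hours * HOURLY_COST[node_type]
    return min(models, key=score)


models = train(HISTORY)
fastest = best_resource(models, 30, objective="time")
cheapest = best_resource(models, 30, objective="cost")
```

With these made-up numbers, the fastest and cheapest choices differ, which is the kind of trade-off between time to solution and cost that the modeling is meant to surface.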

Learn More at the Year’s Biggest Supercomputing Conference

Find out more about Altair workload orchestration solutions for ML and AI at SC19 on November 19, 2019, in Denver, CO. Altair PBS Works CTO Bill Nitzberg will present “HPC for Artificial Intelligence and Machine Learning for HPC” at our booth (#1319) at 12 p.m.

View Our SC19 Schedule