In my last article, “Collecting & Reporting Site-Specific Job Metrics,” I talked about how you could, well, do what the title suggests: collect and report site-specific job metrics with some easily written and implemented hooks. This article will talk about visualizing these newly defined site-specific job metrics by using PBS Analytics.
PBS Analytics is an easy-to-use job accounting and reporting product that provides PBS Works administrators with advanced analytics to support data-driven planning and decision making. Easily extensible to meet your unique requirements, PBS Analytics consolidates data from multiple PBS Professional servers, providing a global view of HPC usage for chargeback, capacity planning, troubleshooting and project management. While this product has been available for several years, we only recently launched a new capability for visualizing custom resources.
A Little Terminology
The PBS Analytics application consists of three components — it is important to understand these, as we will reference them in this article.
The Data Collector has two functions. First, it copies the accounting logs from your site’s PBS Professional accounting log location to an intermediate holding area. Second, it transports the information stored in the holding area to the machine where the PBS Analytics web application and parser are installed.
The Parser reads the PBS Professional accounting logs, parses the appropriate information from the accounting logs, and then loads this information into a database.
The Web Application is a graphical web-based accounting, analytics and reporting application used by the users and administrators.
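To make the Parser's job a bit more concrete, here is a minimal Python sketch of reading one accounting record. It relies on the general shape of a PBS Professional accounting line (timestamp, record type, job ID, and a message of key=value pairs, separated by semicolons). The sample line, job ID, user, and values are made up for illustration, and I am assuming a custom resource shows up as resources_used.<name>. This is not how the PBS Analytics Parser is implemented, and it does not handle values that contain spaces.

```python
from typing import Dict, NamedTuple


class AccountingRecord(NamedTuple):
    timestamp: str    # e.g. "04/24/2014 10:15:32"
    record_type: str  # e.g. "E" for a job-end record
    job_id: str
    attributes: Dict[str, str]


def parse_record(line: str) -> AccountingRecord:
    """Split one accounting log line into its four semicolon-separated fields."""
    timestamp, record_type, job_id, message = line.rstrip("\n").split(";", 3)
    attributes: Dict[str, str] = {}
    for token in message.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            attributes[key] = value
    return AccountingRecord(timestamp, record_type, job_id, attributes)


# Hypothetical sample line; every value here is invented for the example.
sample = (
    "04/24/2014 10:15:32;E;1234.sdb;user=alice queue=workq "
    "resources_used.cput=00:10:00 resources_used.energy_used=5120"
)
record = parse_record(sample)
print(record.job_id, record.attributes["resources_used.energy_used"])
```

Once records are in this form, loading them into a database is an ordinary insert per job.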
First, we need to get the PBS Professional accounting logs copied over to the system where the PBS Analytics web application and parser are installed. If the data collector supports the system running your PBS Server, this process is easy and happens automatically once you install it.
However, if the data collector does not support the system running the PBS Server (the Cray system I used in my project was not supported), then we have to get a little creative about getting the accounting logs to the system running the parser and web application. One option is to set up a nightly cron job that starts about five minutes after midnight and remote-copies (e.g., with scp) the PBS Professional accounting logs from the Cray system hosting the PBS Server to a PBS Analytics supported platform. Another option, if you are up for the challenge, is to export the PBS Server PBS_HOME directory and mount it on a PBS Analytics supported platform; this works just as well, but you need to be careful about the usual NFS hiccups.
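The nightly copy can be sketched in a few lines of Python. PBS Professional writes one accounting file per day, named YYYYMMDD, under PBS_HOME/server_priv/accounting, so a job that runs just after midnight only needs to ship yesterday's file. The PBS_HOME location, destination host, and destination directory below are assumptions for illustration; substitute your own.

```python
import datetime
import subprocess
from typing import List

# Assumed values; replace with your site's PBS_HOME and PBS Analytics host.
ACCOUNTING_DIR = "/var/spool/PBS/server_priv/accounting"
DESTINATION = "pbsa-host:/data/pbs_accounting/"  # hypothetical host and path


def yesterdays_log(today: datetime.date) -> str:
    """Return the path of the accounting log for the day before `today`."""
    yesterday = today - datetime.timedelta(days=1)
    return f"{ACCOUNTING_DIR}/{yesterday:%Y%m%d}"


def build_copy_command(today: datetime.date) -> List[str]:
    """Build the scp command that ships yesterday's log to the destination."""
    return ["scp", "-p", yesterdays_log(today), DESTINATION]


def copy_log(today: datetime.date) -> None:
    """Run the copy; raises CalledProcessError if scp fails."""
    subprocess.run(build_copy_command(today), check=True)


# Dry run: show the command that would be executed for a fixed date.
print(" ".join(build_copy_command(datetime.date(2014, 4, 24))))
```

To use it for real, call copy_log(datetime.date.today()) from a script that cron starts at 00:05 (schedule fields 5 0 * * *), and set up key-based SSH so that scp can run unattended.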
A little disclaimer so the product team does not get upset with me: PBS Analytics is not officially supported on Cray systems. So, if you are running your PBS Server and Scheduler on the sdb node or an esLogin node, you will need to implement one of the options I described above. Other than this little nuance, the rest of the article applies to supported PBS Analytics platforms.
Now that we have the accounting logs getting copied to the PBS Analytics system, we need to configure the Parser to understand the new custom resources. You will need to edit the customresources.conf file located in the <INSTALL_DIR>portal/config directory. Below is a snippet of the configuration file I used to report on the custom resources I created in the last article:
I know what you are thinking: “Wow, that looks like a complex configuration file!” I would agree with you, too. But set aside the “com.altair.pbsworks.parser.exporter.event.base.derived” syntax, and the rest of the configuration file is not too bad. You will want to review the PBS Analytics Administrator Guide to better understand the formatting.
Special note: if you have already parsed your accounting logs, you will want to reset the dataset and reparse with the new configuration file. Again, refer to the Administrator Guide for how to “Reset the PBS Analytics Dataset.”
Depending on how many accounting records you have, the data collection and parsing could take anywhere from a few minutes to a few hours.
Now that we have everything parsed, let’s look at how we can visualize, slice and dice, and drill down into the accounting records.
In Figure 1, you can see that our new custom resources are selectable and can be dragged and dropped into the different fields on the right side of the screen. You can create chart filters, customize your Z- or X-axis, and define which values you want to see on the chart. As you can imagine, the PBS Analytics web application also lets you create many other chart types (e.g., pie, scatter plot, line).
In Figure 2, we are comparing the jobs’ system time (stime) and user time (utime) to the total CPU time (cputime). You might be asking why 2014-04-23 and 2014-04-25 do not have any stime and utime data. That is because I had to disable my integration so the developers could perform their regression tests, and they didn’t want my work to cause unnecessary troubleshooting. As if I were going to write bad code. C’mon!
In Figure 3, we are illustrating the total energy used (energy_used) by all jobs on a given day. This is just another powerful example of showcasing the metrics that may be important for your site.
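If you want to sanity-check a chart like this outside the web application, the aggregation behind it is just a sum per day. Here is a minimal Python sketch; the (end date, energy_used) pairs are invented for illustration, and I am not claiming this is how PBS Analytics computes the chart internally.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def total_energy_by_day(jobs: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum per-job energy_used values, grouped by the job's end date."""
    totals: Dict[str, float] = defaultdict(float)
    for end_date, energy_used in jobs:
        totals[end_date] += energy_used
    return dict(totals)


# Hypothetical (end date, energy_used) pairs, one per finished job.
jobs = [
    ("2014-04-23", 1200.0),
    ("2014-04-24", 800.0),
    ("2014-04-24", 450.0),
]
for day, total in sorted(total_energy_by_day(jobs).items()):
    print(day, total)
```

The same grouping works for any other custom resource; only the value you pull from each job changes.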
PBS Analytics has been available for several years and, like PBS Professional, it can be extended to support site-specific requirements. With some configuration, you can customize your PBS Analytics instance to visualize your site-specific metrics and use this data for chargeback, capacity planning, troubleshooting and project management. For more information or follow-up questions, please feel free to leave a comment below.