Improving Self-Service Analytics by Harnessing the Data Swarm

NOTE: This post originally appeared on Dataversity and was written by Jennifer Zaino.

“Self-Service Analytics is great because it empowers business people,” said Frank Moreno, VP of Worldwide Marketing at Altair. But as the trend has taken off, so have disconnected data silos that create challenges in understanding Data Provenance, modifications, and currency. So, as business users worked with reporting or visualization tools such as Tableau, there often was no way to ensure that the data behind their analytics was well-governed. Betting on the results could be a questionable proposition.

What’s needed is “the ability to share and trust data from multiple, disparate sources and put governance around that,” Moreno said. The goal is to eliminate those issues so that business users can quickly find and access relevant, controlled data they can have confidence in. Corralling, prepping, and centralizing it in a browser-based data marketplace, and letting business users leverage models built by domain experts and governed by Data Stewards, provides a way to do that, he said.

Altair’s Knowledge Hub solution offers these capabilities. The company explains that Knowledge Hub works in any environment, ingesting data from any source and exporting it, cleaned and reformatted, to any reporting, Analytics, or BI/visualization tool. It offers team-driven data preparation and a centralized marketplace from which to discover, access, share, and collaborate, using trustworthy data and trusted data models, Moreno said. After all, the people who are actually familiar with the data are the ones best suited to curate it and help analysts across the business gain insights.

An expert in the finance group, for instance, can designate an official data set, perhaps related to budgets, to raise trust in it and increase its value, creating model files – collections of data extraction and presentation tables – that let users apply the same settings to a periodic report each time it is generated. The data can be extracted from relational and unstructured data sources.
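The internals of Knowledge Hub’s model files aren’t public, but the idea of saved extraction and presentation settings, reapplied each time a periodic report is generated, can be sketched in plain Python. All names here (`ReportModel`, the budget fields) are illustrative assumptions, not Altair’s API:

```python
from dataclasses import dataclass, field

@dataclass
class ReportModel:
    """A saved 'model': extraction settings plus a presentation table
    definition, reapplied to each period's raw report data."""
    columns: list                                  # fields to extract
    filters: dict = field(default_factory=dict)    # column -> required value
    official: bool = False                         # flag set by the domain expert

    def apply(self, rows):
        """Extract and filter raw rows into a presentation table."""
        out = []
        for row in rows:
            if all(row.get(k) == v for k, v in self.filters.items()):
                out.append({c: row.get(c) for c in self.columns})
        return out

# The same model settings are reused each time the report is regenerated.
budget_model = ReportModel(columns=["dept", "budget"],
                           filters={"year": 2019}, official=True)
raw = [{"dept": "Sales", "budget": 100, "year": 2019},
       {"dept": "Sales", "budget": 90, "year": 2018}]
print(budget_model.apply(raw))  # only the 2019 row, with the chosen columns
```

Because the settings live in the model rather than in an analyst’s one-off script, every user who opens the shared model gets the same filtered, trusted view of the data.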

In Knowledge Hub, Data Stewards take on the role of curating certain aspects of the data based upon requests. “In this environment you’re blind unless you say that you are only deploying curated data,” Moreno said. Yet a business that insists on deploying only curated data forfeits agility. Leveraging Knowledge Hub’s capabilities “creates an organic, grass-roots-up way to selectively do Self-Service MDM.”

Formalizing cross-functional analytics means that a business analyst in one department can take advantage of what domain experts in another have already created and that stewards are curating in order to save time and labor. An analyst can open up a model that is shared and is reusable across the business environment and get to work.

“By putting filtered data in a centralized place and letting people use models created by domain experts, that solves problems” that are created when cross-functional teams draw from different versions of data that don’t match each other for analytics efforts, Moreno explained.

Get Social and Get in the Game

Knowledge Hub also recognizes that the digital world is a social world. It adds machine learning-based data socialization to self-service data prep to support organization and governance. Users are guided to the data sources that they should view and use for their Analytics projects.

Users can share, like, and subscribe to data objects, and subscribe to other users, too. They can rate items, and items are recommended to others based on social interactions and usage patterns.
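The article doesn’t describe how Knowledge Hub scores data objects, but a minimal version of recommendation from social interactions might look like the following sketch, where the action weights and the interaction log format are assumptions for illustration:

```python
from collections import Counter

def recommend(interactions, user, top_n=2):
    """Score each data object by total social activity (likes,
    subscriptions, uses) and recommend the highest-scoring objects
    the given user has not already touched."""
    scores = Counter()
    seen = set()
    for who, obj, action in interactions:
        scores[obj] += {"like": 1, "subscribe": 2, "use": 3}[action]
        if who == user:
            seen.add(obj)
    ranked = [o for o, _ in scores.most_common() if o not in seen]
    return ranked[:top_n]

log = [("ann", "budget_2019", "like"),
       ("bob", "budget_2019", "subscribe"),
       ("bob", "sales_q1", "use"),
       ("cal", "sales_q1", "like")]
print(recommend(log, "ann"))  # ann already touched budget_2019
```

Even this simple popularity ranking captures the point of data socialization: the activity of other users surfaces data sets an analyst might otherwise never find.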

It’s all about creating awareness. “At any given time of day no one knows what anyone else is doing,” Moreno said. “So, there is a lot of data duplication. With a social network, you get notified when a person you are following has built something,” for example. That means there’s no need to rebuild the data set or model yourself.

To take things a step further, Knowledge Hub has also tapped into the idea of gamification. Its scorecard feature uses social scoring to recognize top producers who make an impact on the business. Through this, one customer’s employee gained greater visibility for data work that saved his company $5 million a year. “Everyone could see how hard he was working. You know who is doing what and when and how they’re doing it,” according to Moreno.

Upcoming features for Knowledge Hub, expected in the fall time frame, should include automatic identification of personally identifiable information (PII), so that it can be redacted or anonymized.
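How Altair will implement PII identification isn’t specified, but the general technique of pattern-based detection and redaction can be sketched as follows. The patterns and labels here are illustrative assumptions, not the product’s actual rules:

```python
import re

# Hedged sketch: pattern-based PII redaction. Real products typically
# combine patterns like these with ML-based detection for names, etc.
PII_PATTERNS = {
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text):
    """Replace each detected PII value with its category label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Contact jane@example.com, SSN 123-45-6789."))
```

Anonymization would work the same way, except each match is replaced with a consistent pseudonym rather than a label, so analysts can still join on the field without seeing the underlying value.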

Reach Out to Data Scientists

In January 2018, Altair acquired Data Science platform provider Angoss Software Corp. The acquisition brings predictive and prescriptive analytics into Altair’s Monarch application, which runs on Windows desktops, extending its reach from Excel users all the way up to Data Scientists. Angoss was named a Leader in The Forrester Wave: Predictive Analytics and Machine Learning Solutions, Q1 2017.

Monarch connects to dozens of data sources – relational databases, reports, PDF files, and so on – and automatically extracts information into Analytics-ready rows and columns for individual users. It enables workflow building using quality data atop Monarch models, with no need for manual steps to create repeatable processes. With the acquisition of Angoss, Altair adds Predictive and Prescriptive Analytics to Monarch’s ability to acquire, cleanse, prepare, and blend reliable data from multiple sources.

Next up is to move from the integration of these products at the desktop level to getting Angoss onto the Knowledge Hub platform, which is expected to happen around mid-year of 2019. “We want to put simple black box algorithms as part of data preparation, too,” Moreno said. “There’s a good symbiotic relationship between use cases and data.”

Another connection between Monarch and Knowledge Hub will bring the former’s PDF report trapping, which eases data extraction from PDFs, to the latter. “A big model for us is getting at unstructured data like data in PDF files,” Moreno said. “Being able to do PDF trapping in the browser will bring 100 percent parity between the two solutions.” Accomplishing that parity opens up other opportunities to provide managed and SaaS services.
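Report trapping, as the article describes it, means pulling Analytics-ready rows and columns out of semi-structured report output. Monarch’s actual trapping engine is proprietary; the sketch below only illustrates the core idea with a regex “trap” over plain report text, and the field names and layout are invented for the example:

```python
import re

# A "trap" describing the shape of a detail line in a report:
# date, account code, amount. Header and footer lines won't match.
TRAP = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<account>\S+)\s+(?P<amount>[\d.]+)$"
)

def trap_rows(report_text):
    """Extract matching detail lines into rows and typed columns."""
    rows = []
    for line in report_text.splitlines():
        m = TRAP.match(line.strip())
        if m:
            row = m.groupdict()
            row["amount"] = float(row["amount"])
            rows.append(row)
    return rows

report = """ACME CORP   MONTHLY LEDGER
01/15/2019  ACCT-100   250.00
Page 1
01/16/2019  ACCT-200   75.50
"""
print(trap_rows(report))  # two detail rows; header and page footer skipped
```

Running such traps in the browser, against text extracted from PDFs, is essentially the capability the article says will move from Monarch into Knowledge Hub.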