Interview with Dr Frank Säuberlich, Director Data Science & Data Innovation Teradata
BY SANDY STRASSER
(Published in The Produktkulturmagazin issue 2 2017)
Companies that build their business on big data have long faced a huge challenge: they have to ask themselves how to make proper and effective use of it. Predictive analytics is currently one of the most important trends in big data. But how does predictive analytics differ from business intelligence or business analytics? As Director Data Science & Data Innovation, Dr Frank Säuberlich is in charge of the data science & data innovation department at Teradata; he sheds light on this question for us.
Dr Säuberlich, predictive analytics has long been a major trend in the field of big data. What does this term stand for?
It stands for learning from the past as a way of being able to predict future events. This includes, for instance, how much of a product will be sold next week, which of my customers are most likely to switch to the competition (“churn”) or which of my trains has a high likelihood of breaking down, so that I can prevent this by making repairs the evening before, etc.
How does predictive analytics differ from business intelligence or business analytics? What parallels are there, on the other hand, to data mining?
Business intelligence, as well as what is known as “reporting”, deals with the past: how many of my products have I sold in the past month? Which of my customers switched to the competition last month? This is important information that can be used to derive significant findings. Yet it says nothing about why something happened, or what will happen in future.
Predictive modelling, on the other hand, looks forward – as the name suggests. It can be used to predict what’s going to happen. The algorithms and models involved also furnish insights into why something happened in the past: for instance, which customer characteristics distinguished churners from loyal customers? This is the information that the predictive model uses to identify future churners.
Many of the algorithms used in data mining come from the field of predictive modelling. But data mining also uses models that are not applied for predictive purposes, such as clustering, which identifies segments of customers who behave similarly, or association analyses, which show which products are often purchased together.
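To make the idea of an association analysis more tangible, here is a minimal sketch in Python. It is not taken from the interview: the shopping baskets are invented, and the helper names `pair_counts` and `support` are assumptions chosen for illustration. It simply counts how often two products land in the same basket, which is the starting point for more elaborate association methods.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]


def pair_counts(baskets):
    """Count how often each pair of products appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sort so that ("bread", "butter") and ("butter", "bread") are one key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def support(pair_count, n_baskets):
    """Share of all baskets that contain the pair."""
    return pair_count / n_baskets


counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count, support(top_count, len(baskets)))
```

In this toy data set, bread and butter turn up together most often. Real association analyses work on far larger data volumes and add further measures, but the basic question is the same.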
There are many other terms that are used to describe these topics, and they often mean the same thing or at least something similar: buzzwords such as “machine learning”, “artificial intelligence” and “data science”, for example. Predictive analytics plays an important role in all of these areas, which makes it clear how extensive the topic is. That is why we are continually working on a blog series that provides more insight into these topics, step by step.
What are the advantages, as well as the challenges, when a company begins working with predictive analytics?
The benefits are clear: classic, ever-important BI/reporting gains a new dimension, namely the look into the future. This opens up new application areas for analytics and holds great potential to improve ongoing business. Cost-cutting, revenue increases and process optimisation are the essential points in this regard.
The biggest challenge for companies that want to use predictive analytics for the first time is often the simple question: where do you start? Here, it is helpful to identify a first application that can benefit from predictive analytics, and then to try it out. The idea is: “think big, but start small!”, because there are often many steps involved in not only trying out predictive analytics, but also actually implementing it successfully. Data from the past needs to be integrated and cleaned, initial models require testing, and the predictive quality must be measured and improved. Before doing this, you must already have devoted some thought to how you want to use such a model if it is ultimately to work. This can have many facets: from drawing up a simple list of the customers most likely to switch, to complex scoring of machinery sensor data in order to predict breakdowns in real time.
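What “testing an initial model and measuring its predictive quality” can look like in the smallest possible form is sketched below. Everything in it is an assumption made for illustration: the customer history is invented, the number of complaints is a hypothetical feature, and a single learned threshold stands in for a real predictive model. The point is only the procedure: learn from one part of the past data, then measure the quality on a part that was held back.

```python
# Hypothetical history: (number of complaints last year, did the customer churn?)
history = [
    (0, False), (1, False), (0, False), (4, True), (3, True),
    (1, False), (5, True), (2, False), (0, False), (4, True),
    (3, True), (1, False), (2, True),
]


def accuracy(threshold, rows):
    """Share of rows where 'complaints >= threshold' matches the churn label."""
    hits = sum((complaints >= threshold) == churned for complaints, churned in rows)
    return hits / len(rows)


def fit_threshold(rows):
    """Pick the complaint threshold that best separates churners in the given rows."""
    candidates = range(0, max(c for c, _ in rows) + 2)
    # max() returns the first best candidate, i.e. the lowest such threshold.
    return max(candidates, key=lambda t: accuracy(t, rows))


# "Start small": learn on the first part, measure quality on the held-back part.
train, test = history[:8], history[8:]
threshold = fit_threshold(train)
test_accuracy = accuracy(threshold, test)
print(threshold, test_accuracy)
```

The rule fits the training rows perfectly but misses one churner in the held-back rows. That gap is exactly why predictive quality has to be measured on data the model has not seen, and then improved.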
What possibilities for use are there?
The possibilities are enormous. They range from revenue forecasts in retail to predictive maintenance of machinery and vehicles, to detecting or preventing fraud.
What tools do companies have available to them, not only to systematically evaluate and present data but to respond dynamically and automatically as well?
The market for tools such as these is vast and murky. Many companies face a huge challenge in this regard. The following applies: look before you leap. Organisations should look very closely to see which providers there are, enquire in depth about their references, consult analysts’ reports, and not simply base their decisions on a pretty PowerPoint presentation or a quotation for a purportedly good price. Basically, one must distinguish between tools with which a company can apply advanced or predictive analytics on the one hand, and BI reporting tools on the other. The latter have gained added functionalities in recent years, permitting not only static reporting but also interactive reporting, visualisation and even analyses. But that is still not predictive analytics.
Ultimately, the same rule applies to all tools: the algorithms, reports and interactive analyses are only as good as the data on which they are based. That is why the data are of enormous relevance. Careful consideration must be devoted to how data are integrated, prepared and made available. It is also advisable to discuss these matters across different departments. A well set-up data warehouse represents the ideal basis in this regard. In the context of big data, however, this is often not enough, because even “unstructured” data such as texts, image data or web log files have an increasingly important role to play. The focus then shifts to the architecture of all database systems, making analyses of all data future-proof.
What applications are there for integrating predictive analytics into planning processes?
“Predictions” have always played a major role in the area of planning. Here, however, “data-driven” predictions based on predictive analytics are often not used; instead, “gut decisions” are made or panels of experts are consulted. Yacht sales provide an example of just such an application. In this case, the concern was to predict the residual values of leasing returns. One might consider this a rather “easy” case. But in fact this is a complex and above all business-relevant subject area. Such forecasts of residual value are often made on the basis of monthly planning meetings with in-house experts and a variety of Excel sheets, in the interest of finally arriving at forecasts of the residual value of individual yacht models.
Our customer can display its forecast of residual value in an app created for this purpose. This is a purely data-driven prediction of residual values based on detailed past data (such as information on pricing of certain models or combinations of certain leasing options, etc.). The result: the process for determining residual value is much faster, and predictive analytics is now included in core process planning. The knowledge of experts naturally still has its place in planning, and meetings will also continue to be held. Ultimately, it is the way these elements are combined that makes a decisive difference.
What positive effects does all of this have on operational and strategic decision-making within a company?
It has a huge influence. As mentioned above, predictive analytics opens up a new dimension of the information available to decision-makers. In addition to the backward-looking BI/reporting point of view (how many products of a certain category have I sold in the past month?), it mainly involves a look towards the future (how many will we sell next month?). This, in turn, directly affects operational decisions (how much of the product must I keep on hand in the shop? how many employees are needed? etc.). It also affects strategic decisions, of course (should I adapt my product portfolio?).
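The difference between the two points of view can be shown with a few lines of Python. This is only a sketch under stated assumptions: the monthly sales figures are invented, the function names are hypothetical, and a simple three-month average stands in for a real predictive model, which would draw on far more detailed data.

```python
# Hypothetical monthly unit sales for one product category, oldest month first.
monthly_sales = [120, 130, 125, 140, 150, 160]


def last_month(sales):
    """Backward-looking view: how many units were sold in the most recent month?"""
    return sales[-1]


def forecast_next_month(sales, window=3):
    """Forward-looking view: naive forecast as the mean of the last `window` months."""
    recent = sales[-window:]
    return sum(recent) / len(recent)


reported = last_month(monthly_sales)
forecast = forecast_next_month(monthly_sales)
print(reported, forecast)
```

The reported figure answers “how many did I sell?”, while the forecast feeds the operational questions about stock levels and staffing. A moving average ignores trend and seasonality, so it serves here only as a placeholder for the forward-looking side.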
Given this, how can such an approach influence profitability or the competitive situation?
I am convinced that in future companies will remain competitive only if they build infrastructure that permits evaluation of data critical to business success. In this connection, the use of predictive analytics is one of the decisive factors of success.
What role will past-oriented business indicators play at all in future if more and more attention is devoted to predictive analytics?
I would not consider the two areas mutually exclusive. Past-oriented indicators still have their usefulness and will remain an important part of how companies understand and control their business. What will change (and is already changing) is how these metrics are distributed to the right target group within the company. Interactive dashboards are becoming increasingly important, replacing pages and pages of printed reports. These dashboards make ever-increasing use of forward-looking KPIs derived from predictive analytics.
And what’s more: the end users and addressees of the KPIs have greater freedom to undertake analyses of the underlying data on their own, ranging from changing the parameters of a report through to simulations or even their own analyses of the data. The key question is this: what is the best way of directly and interactively providing decision-makers with the information they need to make important decisions for the company? In this regard, BI and predictive analytics are certain to move closer together.
DR FRANK SÄUBERLICH
As Director Data Science & Data Innovation, Dr Frank Säuberlich is in charge of the data science & data innovation department at Teradata. His duties include picking up on the latest market and technology trends, driving these trends, and making them available to clients. His career includes positions at SAS Deutschland as Senior Technical Consultant, and as Regional Manager Customer Analytics in the field of consulting with Urban Sciences International. He has been with Teradata International since 2012 as Expert for Advanced Analytics and Data Science, and has since been appointed Director Data Science (International).
Picture credits © Markus Spiske/Unsplash