It is clear that a new age is upon us. Evidence-based decision-making (aka Big Data) is not just the latest fad; it's the future of how we are going to guide and grow business. But let's be very clear: there is a huge distinction to be made between "evidence" and "data." The former is the end game for understanding where your business has been and where it needs to go. The latter is the instrument that lets us get to that end game. Data itself isn't the solution. It's just part of the path to that solution.
The confusion here is understandable. In an effort to move from the Wild West world of shoot-from-the-hip decision making to a more evidence-based model, companies realized that they would need data. As a result, organizations started metering and monitoring every aspect of their businesses. Sales, manufacturing, shipping, costs and whatever else could be captured were all tracked and turned into well-controlled (or not so well-controlled) data.
I would argue that what you want and what you need is to turn that data into a story. A story explains the data rather than just exposing it or displaying it: a narrative that gives you context for today's numbers by exploring the trends and comparisons you need in order to make sense of it all. The belief that Artificial Intelligence can support the generation of natural language reporting from data is what drove me to help found our company, Narrative Science. I fundamentally believe that a machine can tackle and succeed at freeing insight from data to provide the last mile in making big data useful, and this belief was the driver in building out a technology platform that makes it real.
It may well be the case that you already have someone who looks at the data, builds the queries, interprets the results and writes up the report. But this is time-consuming and labor-intensive. It doesn't scale. And, given all of the time and money that was put into gathering and managing that data, why stick with non-scalable and expensive ways to perform data interpretation and craft communications?
If we're going to really capitalize on Big Data, we need to get to human insight at machine scale. We will need systems that not only perform data analysis but also communicate the results they find in a clear, concise narrative form.
For the most part, we know what we want out of the data. We know what analysis needs to be run, what correlations need to be found and what comparisons need to be made. By taking what we know and putting it into the hands of an automated system that can do all of this and then explain it to us in human terms, or natural language, we can achieve the effectiveness and scale of insight from our data that was always its promise but, until now, has not been delivered. By embracing the power of the machine, we can automatically generate stories from the data that bridge the gap between numbers and knowing.
In order to do this, your starting point has to be the story (or the communication) itself. It defines your information needs. In turn, this defines the kinds of analysis that need to be performed on the facts at hand. Finally, the required facts define how you are going to derive these elements of information from the data you have. It is important to note that the starting point for how to think about this problem is the story and its communication goals, not the data. The data is purely instrumental to the communication you want to support. Of course, once configured, the system actually runs in the other direction, from the bottom up against new data as it arrives.
The overall flow goes from data to fact to angle to story to language. The language is only applied after the system actually figures out what it is going to say.
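To make that flow concrete, here is a minimal sketch of a data-to-fact-to-angle-to-story-to-language pipeline. Every function name, field, and template below is an illustrative assumption on my part, not a description of Quill's actual internals; it only shows the ordering: language is rendered last, after the facts and the angle are settled.

```python
# Minimal sketch of a data -> fact -> angle -> story -> language pipeline.
# All names, fields, and templates are illustrative assumptions.

def derive_facts(data):
    # Turn raw data into discrete facts worth talking about.
    change = data["sales_this_week"] - data["sales_last_week"]
    return {"sales": data["sales_this_week"], "change": change}

def select_angle(facts):
    # Pick the most newsworthy framing of the facts.
    if facts["change"] > 0:
        return "growth"
    if facts["change"] < 0:
        return "decline"
    return "flat"

def compose_story(facts, angle):
    # Organize the facts under the chosen angle; still no words yet.
    return {"angle": angle, "facts": facts}

def render_language(story):
    # Language comes last, once the system knows what it wants to say.
    templates = {
        "growth": "Sales rose by {change} units to {sales}.",
        "decline": "Sales fell by {change} units to {sales}.",
        "flat": "Sales held steady at {sales} units.",
    }
    f = story["facts"]
    return templates[story["angle"]].format(sales=f["sales"],
                                            change=abs(f["change"]))

data = {"sales_this_week": 940, "sales_last_week": 880}
facts = derive_facts(data)
story = compose_story(facts, select_angle(facts))
print(render_language(story))  # -> Sales rose by 60 units to 940.
```

The point of the structure is that the templates are the least interesting part; swapping the data for next week's numbers changes the facts, which can change the angle, which changes the story the same code tells.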
Here's an example. Imagine, for a moment, that you run an organization with multiple restaurant outlets and you have amassed point-of-sale data for each of your franchisees, but none of them are using that data because they just don't get what they need from it. They need insight into how their stores are doing and what they should do next. You need to give each of them a report that actually explains how they are doing in comparison to themselves over time, how they might compare to other restaurants, and where there might be shortfalls. This defines the communication, which then defines the flow of analysis back to the data level. Graphically, this becomes:
My favorite piece is the last element that our system (called Quill) generates: the advisory. Quill looks for high-margin items that have sales shortfalls (in comparison to regional cohorts) that are fixable in the near term. The fact that other stores in a cohort are selling an item is evidence that there is no regional issue at play. And the fact that the gap is not huge means that bridging it is achievable. All of this comes together to let the system say things like this:
Foot Long Hot Dogs were this week's weakest menu item with average daily sales of fewer than 140 units. Bringing the store's daily sales of Hot Dogs up to the same level as the co-op's would add about $566 more profit each month. Over a year, that's an extra $6,828. The store only needs to sell six more units per day to accomplish this.
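The advisory logic described above can be sketched in a few lines. The field names, thresholds, and sample figures here are illustrative assumptions of mine, not Quill's actual implementation or the real numbers behind the hot dog example; the sketch only captures the two tests named in the text: the item must be high margin, and the gap against the cohort must be small enough to close.

```python
# Sketch of the advisory logic: flag high-margin items whose store sales
# trail the regional cohort by a small, closable gap. All field names,
# thresholds, and figures are illustrative assumptions.

def find_advisories(store_sales, cohort_sales, margins,
                    min_margin=1.50, max_gap_units=10):
    """store_sales / cohort_sales: item -> average daily units sold.
    margins: item -> profit per unit in dollars."""
    advisories = []
    for item, margin in margins.items():
        if margin < min_margin:
            continue  # only high-margin items are worth the advice
        gap = cohort_sales.get(item, 0) - store_sales.get(item, 0)
        # Cohort stores selling more is evidence the shortfall is local,
        # and a small gap means bridging it is achievable.
        if 0 < gap <= max_gap_units:
            monthly = gap * margin * 30  # assumes a 30-day month
            advisories.append({
                "item": item,
                "extra_units_per_day": gap,
                "monthly_profit": round(monthly, 2),
                "annual_profit": round(monthly * 12, 2),
            })
    return advisories

advice = find_advisories(
    store_sales={"Foot Long Hot Dog": 134},
    cohort_sales={"Foot Long Hot Dog": 140},
    margins={"Foot Long Hot Dog": 3.10},
)
# advice[0] -> 6 extra units/day, $558.0/month, $6696.0/year
```

A narrative layer like the one in the quoted advisory would then sit on top of this output, turning each record into a sentence or two of plain English.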
Of course, each restaurant needs to get the advice that is relevant to it. Which means that Quill needs to generate at scale (once a week for over 12,000 restaurants) while also remembering what it said the week before so that it doesn't repeat itself. This gives franchise owners the ability to make decisions driven by the stories and insight that explain their businesses rather than by the data alone.
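That "don't repeat yourself" requirement amounts to keeping a memory of what was said to whom, and when. Here is one minimal way to sketch it; the function, the in-memory dict, and the one-week cooldown are all assumptions for illustration (a production system would persist this state), not how Quill actually does it.

```python
# Sketch: suppress an advisory that was already delivered to the same
# store for the same item just last week. The in-memory dict and the
# one-week cooldown are illustrative assumptions.

previously_sent = {}  # (store_id, item) -> week number last sent

def should_send(store_id, item, week):
    last = previously_sent.get((store_id, item))
    if last is not None and week - last <= 1:
        return False  # said this last week; find something new to say
    previously_sent[(store_id, item)] = week
    return True
```

Under this sketch, an advisory about the same item for the same store is allowed in week 10, suppressed in week 11, and allowed again by week 13.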
To get scale from data interpretation, we have to embrace the power of the machine to extract and explain the data that it alone is in a position to analyze and then communicate. With guidance from the business, the machine can provide us with the human link between the world of big data and the actual end game we want: a world of evidence-based insight and decision-making. Because the value of big data isn't the data. It's the narrative.