Companies that aim to score big over the long term with big data must do two very different things well. They must find interesting, novel, and useful insights about the real world in the data. And they must turn those insights into products and services, and deliver those products and services at a profit.
While the two goals are mutually reinforcing, companies actually require two distinct departments. To succeed at the first, companies should set up and manage a "data laboratory" with a staff of data scientists who question everything; a loose structure that promotes collaboration; a longer-term focus; and a culture that values creativity and the pursuit of "deeper understanding" above all else. Think here of the great Industrial Age labs, such as Bell Laboratories, IBM Research, Xerox PARC, and their smaller-scale brethren in industry after industry.
For the second, companies should set up and manage a "data factory" with a staff of process engineers and others with deep technical skills who "get the job done"; a tight structure that promotes consistency, scale, and decreasing unit cost; a shorter-term focus; and a culture that values quality and revenue above all else. Think here of the manufacturing counterparts of the labs referenced above.
Companies must not confuse the separate roles. But in the digital world, this is all too easy. To understand the difference, consider software. In their search for new insights, data scientists write enormous quantities of code, but that code is not designed to meet commercial standards for scalability, security, and stability. Commercial-grade code is created and supported in the factory.
The laboratory. To succeed with the data lab, companies must create an open, questioning, collaborative environment. They must nurture a critical mass of data scientists and provide them access to lots of data, state-of-the-art tools, and time to dream up and work through hundreds of hypotheses, most of which will not yield insight. The data scientists should also have the opportunity to hone the hypotheses that do. Companies must build a management team that can point data scientists in fruitful directions (perhaps "herd them" is more apt) and assemble them in highly talented, diverse teams. Finally, management must learn to tolerate risk while at the same time delivering a steady stream of insights that improve existing products and services; an occasional insight that leads to a new product or service; and, if the data lab is managed well and is lucky, a fundamental insight that reshapes a sector or creates a new one.
We want to doubly emphasize these points because promises of just the opposite are so loud. The many claims for the simplicity of extracting business insight from raw data put us in mind of the famous Sidney Harris cartoon: "... and then a miracle occurs." Make sure you ask your data scientists "to be more explicit here" before committing big dollars.
The factory. The work of creating a product or service from an insight, figuring out how to deliver and support it, scaling up to do so, dealing with special cases and mistakes, and doing so at a profit is beyond the scope of the lab. It calls for a sense of urgency; discipline and coordination; project plans and schedules; and higher levels of automation and repeatability. The work requires many more people with a wider variety of skill sets, a more rigid environment, and different sorts of metrics. While one might use revenue from new products and patent applications to run the lab, one might use total revenue, quality, and unit cost to run the factory. In many respects, these requirements are the polar opposites of the lab's. We use the term "factory" to make the distinction clear.
To be clear, creativity and experimentation are important in the factory, but you must not expect more than incremental thinking and production-oriented solutions.
It's important to make sure that the lab and factory communicate. We use the metaphor of the D4 (data, discovery, delivery, dollars) process to draw attention to the end-to-end thinking and broad communications required. The first two D's, data and discovery, are the purview of the lab, and the fourth, dollars, is the purview of the factory. But the third, delivery, guarantees the communications needed to ensure lab discoveries make their way into products and services. In other words, delivery ensures that the lab and factory understand one another.
There are two keys to success here. First, appreciate the lab and the factory for their respective strengths. Both also have weaknesses, but the overarching goal must be building broad, deep strength in both.
Second, don't push the lab and factory analogy too far. In the Industrial Age, the lab and factory were housed in separate facilities (though often on the same campus). Doing so for data may be helpful, but that is not the point! Indeed, data is softer and changes more rapidly than the Industrial Age's raw materials, so if anything, the lab and factory must collaborate more closely.
In the world of big data, you improve your odds for the big result if you can generate big ideas and deliver in a big way. Build your lab and your factory in parallel.