Get the Right Data Scientists Asking the “Wrong” Questions

Wouldn’t it be great to catch the next Bernie Madoff well before his Ponzi scheme collapsed around us?

That’s not a rhetorical question. Advances in the field of data science have brought us to the point where it’s reasonable to expect that an ongoing program of fraud could be identified in its early stages by people with access to the right data to cross-reference and query. And more than ever before, organizations and even ordinary citizens have access to massive data sets; they can draw on publicly available information in dispersed domains such as social media, open source projects, government statistics, and even weather patterns.

But data by itself is meaningless. It’s the skill of the data scientist that makes the difference. The best of them allow us to see what a data set contains, to visualize relationships between data points, to ferret out insights, to turn expectations topsy-turvy, and ultimately to answer previously unanswerable questions for businesses.

So, what makes an exceptional data scientist? When I first started practicing what we now call data science, I thought anyone attempting this job had to be classically trained in scientific method, statistics, math, or computer science – which was how I got into the field. I now recognize that while those are important skills, that list is by no means exclusive. Moreover, it’s possible to have all of these, and still not be able to pioneer what can be done with the numbers.

Fundamentally, what sets a great data scientist apart is fierce curiosity – it’s the X factor. You can teach the math and the analytical tools, but not the tenacity to experiment and keep working to arrive at the best question – which is virtually never the one you started out with.

And even that insanely curious data scientist, if he or she insists on working alone, won’t be able to produce the most valuable insights. Those come from high-performing teams of people who are individually curious and naturally creative, but also collaborative in their approach to the art and science of experimentation. A great data science team is like a jazz quartet, where the players are always riffing off one another, and each takes the music to a new and unexpected place. In fact, my team includes a musician – and also a forestry major – as well as statisticians and computer scientists. The cognitive strengths that enable creative minds to see patterns in Bach fugues or in tree growth rates lend themselves elegantly to seeing patterns in, say, genetic code or disease markers for pharmaceutical effectiveness.

Along with my changing sense of who the “right people” for data science are, I’ve also developed an appreciation for the value of the “wrong questions.” The idea that a team should start off on the wrong foot might sound counterintuitive, but our data science team at Booz Allen spends a lot of time asking, and experimenting with, “wrong” questions in order to get to the better questions that yield solutions for clients.

This happened recently with a large financial system we studied. Our task was to find a way to detect fraud earlier, which would prevent much of it and save our client money. The fraud had manifested itself in hundreds of different ways, but there was so much of it and the fraudsters moved so quickly that we couldn’t keep up with the patterns needed to track it. Working with ten years of data and 400 variables, we were trying to model what “bad” looks like in order to detect it and stop future perpetrators.

So we changed the nature of the question we were asking. Instead of, “How do we model bad?” we asked “What if we modeled good?” And as we found out, modeling what a good person taking compliant actions looks like is a far more effective way to detect and prevent fraud. In practice, that meant going beyond individual transactions to focus on patterns of behavior by people who are, for example, very consistent in terms of where they live and what income they have. In light of “good” behavior patterns, interesting anomalies are easier to detect and take action on. And “bad” behavior and the inconsistencies associated with it (such as a Madoff-style Ponzi scheme) stand out strongly. Starting with this wrong question ultimately enabled us to identify more than $1 billion in massive, widespread fraud for our client.
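To make the “model good” idea concrete, here is a minimal sketch in Python. It is not the approach our team actually used, and the feature names (address_changes, income_change_pct), the sample values, and the threshold of 3 are all illustrative assumptions. It summarizes known-compliant records with a mean and standard deviation per feature, then flags any new record that sits far outside that “good” profile.

```python
from statistics import mean, stdev

def fit_good_profile(good_records):
    """Summarize each feature of known-compliant records as (mean, stdev)."""
    profile = {}
    for feature in good_records[0]:
        values = [record[feature] for record in good_records]
        profile[feature] = (mean(values), stdev(values))
    return profile

def anomaly_score(profile, record):
    """Return the largest absolute z-score of the record across all features."""
    scores = []
    for feature, (mu, sigma) in profile.items():
        if sigma == 0:
            # Every good record agreed on this value; any deviation is extreme.
            scores.append(0.0 if record[feature] == mu else float("inf"))
        else:
            scores.append(abs(record[feature] - mu) / sigma)
    return max(scores)

def flag_anomalies(profile, records, threshold=3.0):
    """Keep the records that deviate from the good profile beyond the threshold."""
    return [r for r in records if anomaly_score(profile, r) > threshold]

# Hypothetical "good" behavior: stable address, modest yearly income change.
good_records = [
    {"address_changes": 0, "income_change_pct": 2.0},
    {"address_changes": 1, "income_change_pct": 3.0},
    {"address_changes": 0, "income_change_pct": 1.0},
    {"address_changes": 1, "income_change_pct": 4.0},
    {"address_changes": 0, "income_change_pct": 2.5},
]

# Hypothetical new records to screen.
candidates = [
    {"address_changes": 0, "income_change_pct": 2.2},
    {"address_changes": 6, "income_change_pct": 40.0},
]

profile = fit_good_profile(good_records)
flagged = flag_anomalies(profile, candidates)
print(flagged)
```

The point of the sketch is the framing, not the statistics: the model never needs an example of fraud, only a description of consistency. A real system would use far richer behavioral features and a more robust notion of distance than a per-feature z-score.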

As companies look to data to solve increasingly complex challenges, they will become ever more reliant on their data scientists’ curiosity, tenacity, and refusal to accept the status quo. Learning to ask – and answer – bigger questions using data science starts with an organization’s openness to starting data experiments, repeatedly asking the “wrong” questions, and learning in fast iterations. Once you begin to ask why your analytics are yielding certain results, you’ll uncover the most relevant question: “How does this help me get to the problem I want to solve?”

The true nature of data science consists of asking a series of questions – and accepting analytic failures, which ultimately lead to the bigger questions, the better insights, and the more valuable decisions. So why not ask a question like: “How can we catch the next Bernie Madoff before his Ponzi scheme collapses around us?” It might not turn out to be exactly the right question, but it’s exactly the kind of challenge that gets a great data scientist thinking.
