Earlier this month, DJ Patil, U.S. Chief Data Scientist with the White House Office of Science and Technology Policy, held a Q&A session on ProductHunt.com. We collected our favorite questions into a blog post, which you can read here. We thought it would be interesting to take key questions from the session and answer them. First, Becky Champion, Senior Consulting Software Engineer at LexisNexis, responded to the questions. In a follow-up post, Jesse Shaw, Senior Consulting Software Engineer, also at LexisNexis, provided his insight.
In this, our final post of the series, David Wheelock, Consulting Software Engineer, also with LexisNexis, provides his answers.
Q. Where do you look for emerging trends (e.g. sources, events, websites etc.)?
It’s anywhere, everywhere and everything! We are just starting to come into an era where we don’t just look at the stock market or census data alone. We are starting to be able to toss all this data together into a huge info-salad, finding new correlations between apparently disparate pieces of data. What effect does a storm in Montana have on the market price of oranges? Can behavioral trends in the modification of Wikipedia pages predict political or economic fluctuations? Does a shift in Twitter sentiment act as a harbinger of changes in travel behavior? We don’t know these things. Data science is about finding out.
Q. What drives you towards data science as opposed to computer science, where the skill sets can often overlap?
I was initially drawn to computer science because of my love of puzzles. In computer science, you have a problem to solve and you develop an algorithm to solve it, which can be exciting and challenging. But then it’s done and it’s on to the next puzzle. The wonderful thing about data science is that it’s really a meta-puzzle. You write an algorithm and it corrals the data in the manner you prescribed, but then the data talks back to you, presenting new challenges. I suppose you could refer to data science as trying to hit a moving target, but really it’s more like a convoluted game of follow-the-leader. The data leads you in new and interesting directions, forcing new perspectives and instigating ideas on new algorithms and correlations.
Q. Is it better to be data driven or data informed?
It’s better to be data driven. We can postulate all we want and interpret all we want. But in the end, the data is the data and it will head in whatever direction it chooses to go (barring any intentional manipulations). Yes, I am anthropomorphizing quite a bit here, but that’s because in some ways, data is a living thing that should be respected as an entity in its own right, in much the same way we tend to refer to an ant colony as opposed to a group of individual ants. Being data driven forces one to be intent-agnostic, focusing on what is (or was) rather than what may be.
Q. Can you tell us how a data scientist or a team of data scientists can change or affect a nation’s path?
Well, I believe we’re already seeing it. The prominent statistician Nate Silver has an amazing track record of calling things like political races (although he may be having some difficulty with the current presidential one). The degree of respect he commands when it comes to identifying trends and correlations puts him, I believe, in a position of power and influence. At that level of visibility, everything he touches is subject to the observer effect. If people believe his analysis and predictions, that will have a strengthening effect, improving the chances that the outcome will match those predictions.