Welcome back to the round-up, an overview of the most interesting data science, statistics, and analytics articles of the past week. This week, we have four fascinating articles on topics ranging from collective intelligence to correlations. In this week's round-up:
- Future of Engagement: Collective Intelligence
- Big Data Debate: Will Hadoop Become Dominant Platform?
- When Mere Data Isn’t Enough
- Nate Silver Gets Real About Big Data
This is an interesting blog post on collective intelligence, part of a series about the future of engagement. The post defines collective intelligence, attributes its rise to some broad recent trends, provides some examples of companies that have used it, and explains a little about how it works. It then describes how different brands are using collective intelligence and what we can expect to see in its future.
Anyone who follows Big Data has certainly heard a lot about Hadoop. This InformationWeek article pits two Big Data experts against each other in a debate about whether Hadoop will end up at the center of companies' data architectures. One expert says future architectures will revolve around Hadoop, while the other argues that SQL will need to be the lingua franca for analytics and transactional database applications. Read both sides and then decide for yourself.
Speaking of Hadoop, we're hosting an introductory Hadoop workshop on April 27th. Details here.
This GigaOM article looks at correlations: when they are useful and when they can be dangerous. It cautions readers not to read too much into correlations that may not actually be causal (such as the correlation found between intelligence and liking curly fries on Facebook). The article argues that when the stakes are small and correlations are the best data you have, using them is fine. However, when important decisions hang in the balance, additional information needs to be considered. The article provides some tangible examples of both cases.
This ReadWrite article outlines some of the Big Data views of Nate Silver, NY Times blogger and author of The Signal and the Noise. The article focuses on how massive amounts of data can cloud our decisions, because the data tends to contain more noise than signal (hence the name of his book). It notes that the more data we have about politics or global warming, the more divided we tend to become on those issues. The bottom line is that the data doesn't speak for itself; it is subject to interpretation. According to Silver, the article says, when generating predictions from data we should take probabilities into consideration, be conscious of biases, and avoid predicting certainties, because we usually get them wrong.
That's it for this week. Make sure to come back next week when we’ll have some more interesting articles! If there's something we missed, feel free to let us know in the comments below.