Weekly Round-Up: CIA Big Data, Unifying Mean/Median/Mode, New Data Startups, and Naked Statistics

Welcome back to the round-up, an overview of the most interesting data science, statistics, and analytics articles of the past week. This week, we have four fascinating articles on topics ranging from data startups to statistics lessons. In this week's round-up:

  • CIA Presentation on Big Data
  • Modes, Medians and Means: A Unifying Perspective
  • A Couple New Notable Data Startups
  • Naked Statistics: Stripping the Dread From the Data

CIA Presentation on Big Data

This is a Business Insider article about the presentation given by CIA Chief Technology Officer Ira "Gus" Hunt at GigaOM's Structure: Data conference in New York. The presentation covered how the agency plans to capture, store, and use the vast amounts of data it is able to collect. The article includes some highlights of the talk and a link to Hunt's slides. The video and transcript of the entire talk can be found on GigaOM's website.

Modes, Medians and Means: A Unifying Perspective

This is a post published earlier this week on the blog of John Myles White, co-author of Machine Learning for Hackers, in which he explains the relationships between the mean, median, and mode, noting that this important topic is usually left out of introductory statistics courses. His explanation of how the three summary statistics relate is intuitive and very well structured. If you have a grasp of basic statistics, this post will help you understand them a little more deeply.
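For readers who want something concrete to play with, here is a minimal sketch of one standard way to relate the three statistics: each can be seen as the single number that minimizes a different total "discrepancy" from the data. The mode minimizes the count of mismatches, the median minimizes the sum of absolute differences, and the mean minimizes the sum of squared differences. This may not match the post's exact presentation, and the helper names (total_loss, best_center), the sample data, and the candidate grid are all made up for illustration.

```python
from statistics import mean, median, mode


def total_loss(data, center, power):
    """Total discrepancy between `center` and every point in `data`.

    power == 0 counts mismatches, power == 1 sums absolute differences,
    power == 2 sums squared differences.
    """
    if power == 0:
        return sum(1 for x in data if x != center)
    return sum(abs(x - center) ** power for x in data)


def best_center(data, power, candidates):
    """Brute-force search for the candidate with the smallest total loss."""
    return min(candidates, key=lambda c: total_loss(data, c, power))


# Illustrative data only; chosen so each minimizer is unique.
data = [1, 2, 2, 3, 7]
# Candidate summaries from 0.0 to 10.0 in steps of 0.1.
candidates = [i / 10 for i in range(0, 101)]

print("mismatch count ->", best_center(data, 0, candidates), "| mode  :", mode(data))
print("absolute error ->", best_center(data, 1, candidates), "| median:", median(data))
print("squared error  ->", best_center(data, 2, candidates), "| mean  :", mean(data))
```

The brute-force grid is only there to keep the idea visible; it works here because the true minimizers happen to lie on the grid.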

A Couple New Notable Data Startups

This week, I came across a couple of articles about new startups in the data space that should be interesting to watch grow. The first was a TechCrunch article about Fivetran, a company that wants to reinvent the spreadsheet so that it can handle modern data analysis tasks that have outgrown traditional spreadsheets. Fivetran is backed by Paul Graham's startup incubator, Y Combinator, and the article provides an overview of the problems the company is trying to solve and how it is going about solving them.

The second data startup article was about Wise.io, a company that is trying to provide machine learning as a service to the masses. The article talks about what the company is trying to accomplish, where the idea came from, and some of its sources of revenue (it is bootstrapped and already profitable).

Naked Statistics: Stripping the Dread From the Data

This is an interesting review, published on the Economist website, of the recently released book Naked Statistics by Charles Wheelan. The book aims to strip away the complexity and explain statistics intuitively, using language, examples, and humor that most people can identify with. The review describes some of the specific examples the book uses to illustrate statistical concepts, comments on some of the other ways Wheelan has chosen to deliver the material, and highlights some of the things you will learn from reading the book.

That's it for this week.

