Fast Data Applications with Spark & Python Workshop on November 8th

Data Community DC and District Data Labs are excited to be hosting a Fast Data Applications with Spark & Python workshop on November 8th. For more info and to sign up, go to http://bit.ly/Zhj0y1. There’s even an early bird discount if you register before October 17th!

Hadoop has made the world of Big Data possible by providing a framework for distributed computing on economical, commercial off-the-shelf hardware. Hadoop 2.0 implements a distributed file system, HDFS, and a computing framework, YARN, which together allow distributed applications to easily harness the power of clustered computing on extremely large data sets. Over the past decade, the primary application framework has been MapReduce, a functional programming paradigm that lends itself extremely well to designing distributed applications but carries with it a lot of computational overhead.
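To make the MapReduce paradigm concrete, here is a minimal single-machine sketch of a word count in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are illustrative assumptions, and the whole thing runs in one process. It only shows the shape of the computation: a mapper emits key/value pairs, a shuffle groups values by key, and a reducer folds each group into a result.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: fold the values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["spark and hadoop", "hadoop and yarn"]
    print(word_count(sample))
```

In this sketch, every phase works on in-memory lists. On a real cluster the mappers and reducers run on different machines, so the intermediate pairs typically have to be written out and moved across the network between phases, which is one source of the overhead mentioned above.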