On Tuesday, January 29th, nearly 90 academics, professionals, and data science enthusiasts gathered at JHU APL for the kick-off meetup of the new Mid-Maryland Data Science group. With samosas on their plates and sodas in hand, members filled the air with conversations about their work and interests. After the meal, members were ushered into the main auditorium and the presenters took their places at the front.
Greetings and Mission
by Jason Barbour & Matt Motyka
Jason and Matt kicked off the talks with an introduction to the group. Motivated by both the growth of data science and the vast opportunities made available by powerful free tools and open access to data, they described their interest in creating a local group that helps grow the Maryland data science community. As software developers with analytic experience, Jason and Matt next described their five keys to a successful analytic: infrastructure, people, data, model, and presentation. Lastly, they presented metrics about the interests and experience of the members.
The Rise of Data Products
by Sean Murphy
With excitement and passion, Sean took the stage to show how now is the Gold Rush for data products. Laying out the definition of a data product and cycling through several well-known examples, Sean explained how these products are able to bring social, financial, or environmental value through the combination of data and algorithms. Consumers want data, and the tools and infrastructure needed to meet this demand are available either freely or at extremely low cost. Data scientists are now able to harness this stack of tools to provide the data products that consumers crave. As Sean succinctly stated, it is a great time to work with data.
The article version of the talk can be found here.
The Variety of Data Scientists
A full-fledged data scientist himself, Harlan followed Sean by presenting his research into what the name “data scientist” really means. Using the results of a data scientist survey, Harlan listed several skill groupings that provide a shorthand for the variety of skills that data scientists possess: programming, stats, math, business, and machine learning/big data. Next, Harlan explained that the diverse backgrounds of data scientists can be more accurately categorized into four types: data businessperson, data creative, data researcher, and data engineer. With this breakdown, Harlan demonstrated that the data scientist community is actually composed of individuals with a variety of interests and skills.
Cloudera Impala - Closing the near real time gap in BIGDATA
A true cyber security evangelist, Wayne Wheeles presented how Cloudera’s Impala was able to make near real time security analysis a reality. With his years of experience in the field of cyber security and his prior work with big data technologies, Wayne was given unique access to Cloudera’s latest tool. Through his testing and analysis, he concluded that Impala offered a significant improvement in performance and could become a vital tool in cyber security.
After the last presentation, more than a dozen members joined us at nearby Looney’s Pub to end the night with a few beers and snacks. To everyone's surprise, Donald Miner of EMC Greenplum offered to pick up the tab! You can follow him on Twitter or LinkedIn from this page.
If you missed this first event, don't worry: the next one is coming up on March 14th in Baltimore. Check it out here.