Upcoming Data Science Workshops

We co-host several weekend workshops throughout the year with our partners at District Data Labs. Check out some of our upcoming offerings below. 


Graph Analytics with Python

March 12th from 9am - 5pm
4601 Fairfax Drive
Arlington, VA 22203

Learn how to use Python to construct and analyze a social network, compute cardinality, traverse and query graphs, compute clusters, and create visualizations. In this course, we will construct a social network from email communications. We will cover analyses that compute cardinality, techniques for traversing and querying the graph, and clustering to detect communities. Beyond the basics of graph theory, we will also make predictions and create visualizations from our graphs so that social networks can be readily used in larger data products.


Data Visualization with R

April 2nd from 9am - 5pm
4601 Fairfax Drive
Arlington, VA 22203

Learn how to present data in visually appealing ways and communicate your message effectively to viewers. This course will teach students how to use R to create highly customized visualizations suitable for print and web media. The focus is on using R to build visualizations that apply the elements and principles of good design to communicate data clearly.


Natural Language Processing with Python

April 9th from 9am - 5pm
4601 Fairfax Drive
Arlington, VA 22203

Learn about the features of Python's Natural Language Toolkit (NLTK) and how to build a language-aware data product using clustering algorithms and LDA. In this course, we will begin by exploring NLTK through the corpora that ship with it, which will give us a feel for the features and functionality NLTK provides. This will take up the first half of the course. During the second half, we will focus on building a language-aware data product from a specific corpus: a topic identification and document clustering algorithm applied to a web crawl of blog sites. The clustering algorithm will start with a simple Lesk K-Means clustering and will then be improved with LDA.


Supervised Machine Learning with R

April 30th from 9am - 5pm
4601 Fairfax Drive
Arlington, VA 22203

This course provides an overview of supervised statistical modeling and machine learning and introduces R's capabilities for regression and classification. We will focus on a small subset of algorithms and emphasize out-of-sample evaluation. After this course, you will have used several supervised machine learning methods, learned how to perform inference with these models, and understood how to apply out-of-sample evaluation methods to your own models.