Supervised Machine Learning with R Workshop on April 30th

Data Community DC and District Data Labs are hosting a Supervised Machine Learning with R workshop on Saturday, April 30th. Come out and learn about R's capabilities for regression and classification, how to perform inference with these models, and how to use out-of-sample evaluation methods for your models!


R is a powerful language for statistical computing. A prolific user community backs R with an extensive library of packages. If you can think of it, somebody has probably already written a package for it. R also has a superb IDE, RStudio, which facilitates reproducible research.

This course is for people with some R programming experience. It gives an overview of supervised statistical modeling and machine learning in R. We will focus on a small subset of algorithms and emphasize out-of-sample evaluation.


This course introduces R's capabilities for regression and classification. Many machine learning algorithms exist, and a single class can cover only a small subset of them. We will focus on:

  •  Linear and logistic regression
  •  Decision tree and SVM classifiers
  •  Training sets and test sets
  •  K-fold cross-validation
  •  Prediction vs. inference
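
To give a feel for the out-of-sample evaluation listed above, here is a minimal sketch of k-fold cross-validation for a linear regression in base R. It is not taken from the workshop materials: the built-in mtcars dataset, the formula mpg ~ wt + hp, and the choice of 5 folds are all assumptions made purely for illustration.

```r
# Minimal sketch: 5-fold cross-validation of a linear model, base R only.
# The dataset, predictors, and k are illustrative assumptions.
set.seed(42)                      # make the random fold assignment reproducible
data(mtcars)                      # built-in dataset with 32 rows

k <- 5
# Assign each row to one of k folds at random
folds <- sample(rep(seq_len(k), length.out = nrow(mtcars)))

rmse <- numeric(k)
for (i in seq_len(k)) {
  train <- mtcars[folds != i, ]   # k - 1 folds used for fitting
  test  <- mtcars[folds == i, ]   # held-out fold used for evaluation
  fit   <- lm(mpg ~ wt + hp, data = train)
  pred  <- predict(fit, newdata = test)
  rmse[i] <- sqrt(mean((test$mpg - pred)^2))
}

cv_rmse <- mean(rmse)             # average out-of-sample error across folds
print(cv_rmse)
```

Each row is used for testing exactly once, so the averaged error estimates how the model would perform on data it has not seen, which a single in-sample fit cannot tell you.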


The workshop will cover the following:

  • Setting up an RStudio project and file structure
  • Review of R and RStudio
  • CRAN task view: machine learning
  • Training, testing, and k-fold cross-validation
  • Decision trees and random forests
  • Support vector machines
  • Generalized linear models, focusing on logistic regression
  • Linear regression models
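
As a rough illustration of the difference between inference and prediction with a logistic regression, the sketch below uses base R's glm. It is not from the workshop itself: the mtcars dataset, the model am ~ wt, and the two example car weights are assumptions chosen only to keep the example small.

```r
# Minimal sketch: logistic regression with glm(), base R only.
# Models the probability that a car has a manual transmission (am = 1)
# as a function of its weight (wt, in 1000 lbs). Illustrative assumption.
data(mtcars)

fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Inference: what does the model say about the relationship itself?
coef_table <- summary(fit)$coefficients   # estimates, std. errors, z values, p-values
wald_ci    <- confint.default(fit)        # Wald confidence intervals for coefficients
print(coef_table)
print(wald_ci)

# Prediction: what does the model say about new observations?
new_cars <- data.frame(wt = c(2.5, 3.5))  # hypothetical cars, not in the data
probs <- predict(fit, newdata = new_cars, type = "response")
print(probs)                              # predicted probabilities of a manual transmission
```

The coefficient table and intervals answer questions about the relationship between weight and transmission type, while predict answers questions about individual new cars. The same fitted object serves both purposes.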

After this course you will have used several supervised machine learning methods. You will understand how to use out-of-sample evaluation methods for your models. Where possible, you will learn to perform inference with these models.

More Info and Registration