Data Visualization: Teaching Data Viz

Over the past few months, Data Community DC (DC2) has brought together a series of great speakers for its visualization program, DVDC.  We have begun adding interactive elements to our traditional lecture-style events, breaking up the format so people can freely ask questions of the organizers, speakers, and enthusiasts.  The feedback has been positive, but there has also been a steady request for more depth and detail.  As a result, DC2 has recently begun organizing workshops around our personal expertise, our event speakers, and data practitioners in our network.  This naturally raises the question, "How do you teach data visualization to newcomers with little to no coding experience?"

The DVDC events have focused on "data psychology", "visualization languages", and "visualization techniques".  Data psychology is about understanding people and how you can use visualizations, in the right context, in the right sequence, and starting with the right data, to best communicate your data insights.  Visualization languages is somewhat self-explanatory: it covers the programming languages used to build visualizations, which are becoming increasingly efficient and easy to use.  Visualization techniques is all about how you can visually represent the data so that it is pleasing to the eye, suppresses irrelevant nuances, and highlights key features of the data.

This approach has worked well for our events, but a workshop is not passive learning.  You have 3-4 hours to introduce a topic, explore it, be creative, regroup, discuss, and review lessons learned, and as I was emphatically told, "We don't want you lecturing for 3-4 hours!"  Data psychology is nice, but it is abstract: without something concrete there isn't much to physically play with, and people are pretty creative to begin with.  Focusing on the visualizations by themselves becomes more of an art class as people go wild with their imaginations.  The code behind data visualization is the only way to focus the discussion, the only way to create a "sandbox" in which people can explore the rules, learn what's possible, and find something that brings their imagination to life.

This cannot be an introduction to R or Python, as that would take up the entire class.  If there is little coding experience, we need to make the code obvious and thereby secondary to creating the visualization each person is interested in.  This is easily done by curating a workspace that contains an interesting dataset, which everyone downloads at the beginning of class.  From there we introduce visualization "widgets" that the class can call from the command line.  The widgets can be mixed and matched with each other, or fed different input data, so that each person creates something unique.  We can mix and match classic charts, maps, stacked charts, proportional symbol graphs, heat maps, Gantt charts, stream charts, arc charts, polar charts, etc., enough to show that there are more ways to combine data than you can shake a stick at, and that the result therefore depends more on what you want to say than on what's possible.
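
To make the widget idea a little more concrete, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: the tiny TRIPS dataset, the helper total_by, and the widgets bar_chart and heat_map are hypothetical names, not part of any actual workshop material.  The output is plain text so the sketch runs with nothing but the standard library; a real workshop would more likely wrap a plotting library behind the same kind of one-line calls.

```python
# Hypothetical stand-in for the curated dataset the class would download.
TRIPS = [
    {"station": "Dupont", "hour": 8, "riders": 120},
    {"station": "Dupont", "hour": 17, "riders": 80},
    {"station": "Shaw", "hour": 8, "riders": 40},
    {"station": "Shaw", "hour": 17, "riders": 60},
    {"station": "Anacostia", "hour": 8, "riders": 30},
    {"station": "Anacostia", "hour": 17, "riders": 20},
]


def total_by(rows, key, value="riders"):
    """Sum `value` for each distinct `key`, keeping first-seen order."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals


def bar_chart(totals, width=20, mark="#"):
    """Widget: one horizontal bar per category, scaled to the largest total."""
    largest = max(totals.values()) or 1  # avoid dividing by zero
    label_width = max(len(str(label)) for label in totals)
    lines = []
    for label, amount in totals.items():
        bar = mark * (width * amount // largest)
        lines.append(f"{str(label):<{label_width}} | {bar} {amount}")
    return "\n".join(lines)


def heat_map(rows, row_key, col_key, value="riders", shades=".:*#"):
    """Widget: grid of shade characters; later shades mean larger values."""
    cells = {}
    for row in rows:
        cell = (row[row_key], row[col_key])
        cells[cell] = cells.get(cell, 0) + row[value]
    row_labels = sorted({r for r, _ in cells})
    col_labels = sorted({c for _, c in cells})
    largest = max(cells.values()) or 1  # avoid dividing by zero
    label_width = max(len(str(r)) for r in row_labels)
    lines = []
    for r in row_labels:
        marks = ""
        for c in col_labels:
            amount = cells.get((r, c), 0)
            marks += shades[(len(shades) - 1) * amount // largest]
        lines.append(f"{str(r):<{label_width}} | {marks}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Same widget, different slice of the data.
    print(bar_chart(total_by(TRIPS, "station")))
    print(bar_chart(total_by(TRIPS, "hour"), mark="="))
    # Different widget, same data.
    print(heat_map(TRIPS, "station", "hour"))
```

The point of the design is that each call is a single readable line.  Swapping "station" for "hour", or bar_chart for heat_map, changes the picture without requiring anyone to understand the code inside the widget.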

Of course, the holy grail is interactive visualizations, but that requires new languages and more sophisticated passing of variables between front-end and back-end controllers.  A more advanced class is the natural next step.