Data Visualization DC

Notes on a Meetup

This is a guest post by Catherine Madden (@catmule), a lifelong doodler who realized a few years ago that doodling, sketching, and visual facilitation can be immensely useful in a professional environment. The post consists of her notes from the most recent Data Visualization DC Meetup. Catherine works as the lead designer for the Analytics Visualization Studio at Deloitte Consulting, designing user experiences and visual interfaces for visual analytics prototypes. She prefers Paper by Fifty Three and their Pencil stylus for digital note taking.

Moderating The World IA Data Viz Panel

This weekend was my introduction to moderating an expert panel since switching careers and becoming a data science consultant. The panel was organized by Lisa Seaman of Sapient and consisted of Andrew Turner of Esri, Amy Cesal of Sunlight Foundation, Maureen Linke of USA Today, and Brian Price of USA Today. We had roughly an hour to talk, present information, and engage the audience. You can watch the full panel discussion thanks to the excellent work of Lisa Seaman and the World IA Day organizers, but there's a bit of back-story that I think is interesting.

In the spring of 2013 Amy Cesal helped create the DVDC logo (seen on the right), so it was nice to have someone I'd already worked with. Similarly, Lisa had attended a few DVDC events and asked me to moderate because she'd enjoyed them so much. By itself it's not exactly surprising that Lisa attended some DVDC events and went with who she'd met, but common sense isn't always so common. If you google "Data Viz" or "Data Visualization" and focus on local DC companies, experts, speakers, etc. you'll find some VERY accomplished people, but there's more to why people reach out. You have to know how people work together, and you can only know by meeting them and discussing common interests, which is a tenet of all the DC2 Programs.

Now that the sappy stuff is out of the way, I wanted to share some thoughts on running the panel. I don't know about you, but I fall asleep whenever the moderator simply asks a question and each panelist answers in turn. The first response can be interesting, but each subsequent response builds little on the one before; there's no conversation. This can go on for one, maybe two go-rounds, but any more than that and the moderator is just being lazy, doesn't know the panelists, doesn't know the material, or all of the above. A good conversation builds on each response. If it drifts away from the original question the moderator can jump in, but resetting too often by effectively re-asking the question is robotic and defeats the purpose of having everyone together in one place.

Heading this potential disaster off at the pass, Lisa scheduled a happy hour, hopefully to give us a little liquid courage and create a natural discourse. I did my homework, read about everyone on the panel, and started imagining how everyone's expertise and experience overlapped: accuracy vs communicating information; managing investigative teams vs design iteration; building industry tools vs focused and elegant interfaces; D3.js vs Raphael. The result: a conversation, which is what we want from a panel, isn't it?

Developments with Data Visualization DC

From its inception, Data Visualization DC has been actively searching for new ways to engage its audience, create a more integrated, self-sustaining community, and be a more valuable part of Data Community DC. Most recently we've invested some of our sponsorship funds with Event Central to cover our events with great photography and video. Event Central has also been working with MoDevDC, Action Design DC, and others to create a series of excellent content unique to the Washington DC technical community. Event Central has recently completed the video from the last DVDC event, "Visualizing Christmas & Visual Storytelling", which itself is the result of community participation by SynGlyphX, Plot.ly, Visual.ly, Developfor, and a few volunteers.

Our goal with the local and at large data visualization community is to keep our finger on the pulse, engage with top talent, promote great content, facilitate relationships, and organize resources.  As a result of working with volunteers on Visualizing Christmas we will soon be able to offer workshops in data viz, something we've wanted to do since demand spiked after Andy Trice presented at nclud last summer.

In a similar vein, Antonio has been doing an excellent job putting these videos together, and in time we hope to have a resource where people can easily navigate through these videos via time-tags based on subject, discussion, presenter, etc. Once again we'll reach out to the community: those interested in helping write that algorithm can join DVDC and DC2 during periodic coworking sessions at 1776, or at any coworking spaces interested in hosting us, where we'll get into all the details and other ongoing Data Community DC projects.

Plenty more to discuss, but I'd like to keep this post short(er), so join us at DVDC and let's go down the rabbit hole!

SynGlyphX: Hello and Thank You DC2!

The following is a sponsored post brought to you by one of the supporters of two of Data Community's five meetups.

This week was my, and my company’s, introduction to Data Community DC (DC2).  We could not have asked for a more welcoming reception.  We attended and sponsored both Tuesday’s DVDC event on Data Journalism and Thursday’s DSDC event on GeoSpatial Data Analysis.  They were both pretty exciting, and timely, events for us.

As I mentioned, I’m new to DC2 and new to the “data as a science” community. Don’t get me wrong, while I’m new to DC2 I’ve been awash in data my entire career. I started as a young consultant reconciling discrepancies in the databases of a very early Client-Server implementation. Basically, I had to make sure that all the big department store orders on the server were in sync with the home delivery client application. It was a lot of manual reconciling that ultimately led to me writing code to semi-automatically reconcile the two databases. Eventually (I think) they solved the technical issues that led to the Client-Server databases being out of sync.

More recently, I was working for a company with a growing professional services organization. The company typically hired new employees after a contract was signed, but the new professional services work involved short project durations. If we waited to hire, the project would be over before someone started. We developed a probability-adjusted portfolio analysis approach to compare the supply of available resources (which is always changing as people finish projects, get extended, or leave the organization) with demand (which is always changing as well). That enabled us to determine a range of positions and skillsets to hire for in a defined timeframe.
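
To make the idea concrete, here is a minimal sketch of a probability-adjusted supply vs. demand comparison. It is not the model described above: the skill names, headcounts, probabilities, and the simple expected-value weighting are all hypothetical, chosen only to show the shape of the calculation.

```python
from collections import defaultdict


def expected_headcount(items):
    """Sum probability-weighted headcount per skill.

    items: iterable of (skill, headcount, probability) tuples.
    """
    totals = defaultdict(float)
    for skill, headcount, probability in items:
        totals[skill] += headcount * probability
    return dict(totals)


def hiring_gap(demand, supply):
    """Expected demand minus expected supply per skill (positive = shortfall)."""
    exp_demand = expected_headcount(demand)
    exp_supply = expected_headcount(supply)
    skills = sorted(set(exp_demand) | set(exp_supply))
    return {
        s: round(exp_demand.get(s, 0.0) - exp_supply.get(s, 0.0), 2)
        for s in skills
    }


# Hypothetical pipeline: (skill, people needed, probability the deal closes)
demand = [("analyst", 4, 0.5), ("analyst", 2, 0.9), ("architect", 1, 0.25)]
# Hypothetical bench: (skill, people, probability they are free in the window)
supply = [("analyst", 3, 0.8), ("architect", 1, 1.0)]

gap = hiring_gap(demand, supply)
print(gap)
```

A positive number suggests a skill to hire for; a negative one suggests spare capacity. Expected values give a single point estimate, so producing a range of positions would need something more, such as simulating many scenarios from the same probabilities.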

In both instances, it was data science that drove effective decision making.  Sure, you can apply some “gut” to any decision, but having some data science behind you makes the case much stronger.

So, I was fascinated to listen to the journalists discuss how they are applying data analysis to help: 1) support existing story lines; and 2) develop new story lines. Nathan’s presentation on analyzing AIS data was interesting (and a bit timely, as we had just gotten a verbal win for a client on similar, though not identical, work).

I know the power of data to solve complex business, operational, and other problems. With our new company, SynGlyphX, we are focused on helping people both visualize and interact with their data. We live in a world with sight and three dimensions. We believe that by visualizing the data (unstructured, filtered, analyzed, any kind of data), we can help people leverage the power of the brain to identify patterns, spot trends, and detect anomalies. We joined DC2 to get to know folks in the community, generate some awareness for our company, and get your feedback on what we are doing. Thank you all for welcoming us and our company, SynGlyphX, to the community. We appreciated everyone’s interest in the demonstrations of our interactive visualization technology. Our website traffic was up significantly last week, so I am hoping this is a sign that you were interested in learning more about us. Additionally, I have heard from a number of you since the events, and welcome hearing from more.

Here’s my call to action: I encourage you to tweet us your answer to the following question: “Why do you find it helpful to visually interact with your data?”

See you at upcoming events.

Mark Sloan

About the Author:

As CEO of SynGlyphX, Mark brings over two decades of experience.  Mark began his career at Accenture, co-founded the global consulting firm RTM Consulting, and served as Vice President and General Manager of Convergys’ Consulting and Professional Services Group.

Mark has an M.B.A. from The Wharton School of the University of Pennsylvania, and a B.S. in Civil Engineering from the University of Notre Dame. He is a frequent speaker at industry events and has served as an Advisory Board Member for the Technology Professional Services Association, now the Technology Services Industry Association (TSIA).

General Assembly & DC2 Scholarship

The DC2 mission statement emphasises that "Data Community DC is an organization committed to connecting and promoting the work of data professionals...". Ultimately we see DC2 becoming a hub for data scientists interested in exploring new material, advancing their skills, collaborating, starting a business with data, mentoring others, teaching classes, changing careers, etc. Education is clearly a large part of any of these interests, and while DC2 has held a few workshops and is sponsored by organizations like Statistics.com, we knew we could do more, so we partnered with General Assembly and created a GA & DC2 scholarship specifically for members of Data Community DC.

For our first scholarship we landed on Front End Web Development and User Experience, which we naturally announced first at Data Viz DC. How does this relate to data science? As I was happy to rebut Mr. Gelman in our DC2 blogpost reply, sometimes I would love to have a little sandbox where I get to play with algorithms all day. Then again, this is exactly what I ran away from in 2013 in becoming an independent data science consultant: I don't want a business plan I'm not a part of dictating what I can play with. Enter Web Dev and UX. As Harlan Harris, organizer of DSDC, mentions in his Venn diagram on what makes a data scientist, and as Tony Ojeda later emphasizes, programming is a natural and necessary part of being a data scientist. In other words, there's this thing called the interwebs that has more data than you can shake a stick at, and if you can't operate in that environment then as a data scientist you're asking someone else to do that heavy lifting for you.

Over the next month we'll be choosing the winners of the GA DC2 Scholarship, and if you'd like to see any other scholarships in the future please leave your thoughts in the comments below or tweet us.

Happy Thanksgiving!

From Computer-Aided Journalism to Data Journalism

The Data Visualization DC Meetup was a very successful and enjoyable event: the location (the Washington Post's old printing press room), the attendance (250), the food (5 different types of wrap sandwiches), and the first roundtable event discussing Data Journalism. Jon Schwabish and Sean Gonzalez organized the meetup and presented a new visualization of the Data Visualization DC Meetup membership and other data sets in 3-D maps produced by SynGlyphX. Data Visualization DC is the fastest-growing community within Data Community DC. The eminent roundtable panel consisted of the following:

  • Moderator: Frank Sesno is director of the School of Media and Public Affairs (SMPA) at The George Washington University, an Emmy-award winning journalist, and creator of http://PlanetForward.org, a user-driven web and television project that highlights innovations in sustainability.
  • Panelists:
    • Jeremy Bowers: Jeremy is a developer on the news applications team at NPR. Previously, he was a senior embedded developer at the Washington Post and news technologist at the St. Petersburg Times/Politifact.
    • Nikki Usher: Nikki Usher is a professor at The George Washington University. She studies news production in the digital age, and most recently has been at work studying the intersection of tech and journalism.
    • Derek Willis: Derek Willis is an interactive developer at The New York Times and works mostly on political projects. He maintains The Times' congressional and election data.
    • Kat Downs: Kat Downs is the Graphics Director at The Washington Post where she designs, develops and edits information graphics and multimedia projects.

Moderator Frank Sesno mentioned that new Washington Post owner, Amazon's Jeff Bezos, would have liked to have been there. He had each panelist define what data journalism is to them, and their responses were:

  • Jeremy Bowers: Based on collected data
  • Kat Downs: Making numbers consumable and meaningful
  • Derek Willis: About things that can be measured
  • Nikki Usher: What's different about it from computer-aided journalism (she did a two-week stint at The Guardian, which is considered to be a gold standard for data journalism - see Data Journalism Handbook)

Moderator Sesno asked for closing sound bites from each, starting with himself:

  • Frank Sesno: "Dazzle me with Statistics" from his multimedia reporting - story telling class at GWU
  • Kat Downs: Smart charting, maps, and timeline examples from her recent stories
  • Jeremy Bowers: Get smarter with readers
  • Derek Willis: Not progressing fast enough
  • Nikki Usher: Stay on the site to make money

The lively Q&A from the audience included questions about Jeff Bezos's plans for hiring (no response), automating screen scraping, reader preferences for types of visualizations, paying attention to reader feedback, etc.

Moderator Sesno concluded by thanking the panelists and audience for their participation and asking what GWU could do to help.

I am sure that Sean and the members will think of something, including being part of the retweet program.

The meetup adjourned to Data Drinks at General Assembly across the street from the Washington Post.

Brand Niemann, former Senior Enterprise Architect and Data Scientist with the US EPA, completed 30 years of federal service in 2010. Since then he has worked as a data scientist for a number of organizations, produced data science products for a large number of data sets, and published data stories for Federal Computer Week, Semantic Community and AOL Government.

 

Eclipse Foundation LocationTech DC Tour

Interested in open source software for geospatial systems? Join us on November 14th at GWU for an evening of tech talks about location-aware open source technologies.

This month, the Eclipse Foundation's LocationTech working group is hosting a series of events in six cities, concluding with Washington, DC on November 14th. We'll gather at The George Washington University for a round of invigorating talks in the early evening, followed by drinks and networking at a local watering hole.

Speakers include:

  • Juan Marin, CTO of Boundless (formerly OpenGeo)
  • Eric Gundersen, CEO of MapBox
  • Joshua Campbell, GIS Architect at Humanitarian Information Unit, U.S. Department of State

When: Thursday, November 14, 2013
Where: Elliott School of International Affairs, GWU
Time: 6pm to 9pm (followed by drinks at CIRCA in Foggy Bottom)

The event is free but space is limited. Register today at http://tour.locationtech.org/

About LocationTech

LocationTech is the Eclipse Foundation's industry working group focusing on location-aware technologies. Members of LocationTech are also full-fledged members of the Eclipse Foundation. Eclipse is a vendor-neutral community for individuals and organizations who wish to collaborate on commercially-friendly open source software. The Eclipse Foundation is a not-for-profit, member-supported corporation that hosts technology projects and helps cultivate both an open source community and an ecosystem of complementary products and services.

Fantastic presentations from R using slidify and rCharts

Dr. Ramnath Vaidyanathan of McGill University gave an excellent presentation at a joint Data Visualization DC/Statistical Programming DC event on Monday, August 19 at nclud, on two R projects he leads: slidify and rCharts. After the evening, all I can say is, Wow!! It's truly impressive to see what can be achieved in presentation and information-rich graphics directly from R. Again, wow!! (I think many of the attendees shared this sentiment.)

Slidify

Slidify is an R package that

helps create, customize and share elegant, dynamic and interactive HTML5 documents through R Markdown.

We have blogged about slidify, but it was great to get an overview of slidify directly from the creator. Dr. Vaidyanathan explained that the underlying principle in developing slidify is the separation of the content and the appearance and behavior of the final product. He achieves this using HTML5 frameworks, layouts and widgets which are customizable (though he provides several here and through his slidifyExamples R package).

Example RMarkdown file for slidify

You start with a modified R Markdown file as seen here. This file can have chunks of R code in it. It is then processed to a pure Markdown file, interlacing the output of R code into the file. This is then split-apply-combined to produce the final HTML5 document. This document can be shared using GitHub, Dropbox or RPubs directly from R. Dr. Vaidyanathan gave examples of how slidify can even be used to create interactive quizzes or even interactive documents utilizing slidify and Shiny.
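
The split-apply-combine step can be illustrated with a toy example. This is not slidify's implementation (slidify is an R package, and this sketch is in Python purely for illustration); it assumes slides are separated by a line containing only three dashes, skips the step that runs R code, and uses hypothetical function names and a deliberately tiny renderer.

```python
import html
import re


def split_slides(markdown_text):
    """Split: cut the deck into slides on lines that contain only '---'."""
    parts = re.split(r"^---[ \t]*$", markdown_text, flags=re.MULTILINE)
    return [p.strip() for p in parts if p.strip()]


def render_slide(slide_md):
    """Apply: '## ' lines become headings, every other line a paragraph."""
    out = []
    for line in slide_md.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("## "):
            out.append("<h2>" + html.escape(line[3:]) + "</h2>")
        else:
            out.append("<p>" + html.escape(line) + "</p>")
    return "<section>" + "".join(out) + "</section>"


def build_deck(markdown_text):
    """Combine: wrap the rendered slides in a single HTML document."""
    slides = [render_slide(s) for s in split_slides(markdown_text)]
    return "<!DOCTYPE html><html><body>" + "".join(slides) + "</body></html>"


# Made-up deck content, standing in for Markdown with code output already woven in.
deck_md = """## Bikes available
Mean per station: 7.5

---

## Next steps
Add a map
"""

deck_html = build_deck(deck_md)
print(deck_html)
```

Keeping the three stages separate mirrors the principle described above: the content lives in the Markdown, while appearance and behavior are decided entirely by the rendering and combining steps.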

One really neat feature he demonstrated is the ability to embed an interactive R console within a slidify presentation. He explained that this used a Shiny server backend locally, or an OpenCPU backend if published online. This feature changes how presentations can be delivered: instead of bouncing between windows, the presenter can run the demonstration inside the presentation itself.

rCharts

rCharts is

an R package to create, customize and share interactive visualizations, using a lattice-like formula interface

Again, we have blogged about rCharts, but there have been several advances in the short time since then, both in rCharts and interactive documents that Dr. Vaidyanathan has developed.

rCharts creates a formula-driven interface to several Javascript graphics frameworks, including NVD3, Highcharts, Polycharts and Vega. This formula interface is familiar to R users, and makes the process of creating these charts quite straightforward. Some customization is possible, as well as putting in basic controls without having to use Shiny. We saw several examples of excellent interactive charts using simple R commands. There is even a gallery where users can contribute their rCharts creations. There is really no excuse any more for avoiding these technologies for visualization, and it makes life so much more interesting!!

Bikeshare maps, or how to create stellar interactive visualizations using R and Javascript

Dr. Vaidyanathan demonstrated one project which, I feel, shows the power of the technologies he is developing using R and JavaScript. He created a web application using R, Shiny, his rCharts package (which accesses the Leaflet JavaScript library), and a very little bit of JavaScript magic to visualize the availability of bicycles at different stations in a bike sharing network. This application can automatically download real-time data and visualize availability in over 100 bike sharing systems worldwide. He focused on the London bike share map, which was fascinating in that it showed how bikes had moved from the city to the outer fringes at night. Clicking on any dot showed how many bikes were available at that station.

Dr. Vaidyanathan quickly demonstrated a basic process for mapping points on a city map, changing their appearance, and adding metadata to each point that appears as a pop-up when clicked.
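
The same three steps (place a point, style it, attach pop-up text) can be sketched as plain data. The example below is not the code from the talk, which used R and rCharts; it is a Python sketch that builds a GeoJSON document, a format that web mapping libraries such as Leaflet can display. The station names, coordinates, counts, and the "color" and "popup" property names are all made up for illustration, and a map client would still need its own code to read those properties.

```python
import json


def station_feature(name, lat, lon, bikes, docks):
    """Build one GeoJSON point with a styling hint and pop-up text."""
    fill = bikes / docks if docks else 0.0
    return {
        "type": "Feature",
        # GeoJSON coordinates are ordered [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "name": name,
            # Hypothetical convention: green when at least half full.
            "color": "green" if fill >= 0.5 else "red",
            "popup": f"{name}: {bikes} of {docks} bikes available",
        },
    }


# Made-up stations: (name, latitude, longitude, bikes available, total docks)
stations = [
    ("Station A", 51.5074, -0.1278, 12, 20),
    ("Station B", 51.5155, -0.0922, 3, 15),
]

collection = {
    "type": "FeatureCollection",
    "features": [station_feature(*s) for s in stations],
}
print(json.dumps(collection, indent=2))
```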

You can see the full project and how Dr. Vaidyanathan developed this application here.

Interactive learning environments

Finally, Dr. Vaidyanathan showed a new application he is developing using slidify, rCharts, and other open-source technologies like OpenCPU and PopcornJS. This application allows him to author a lesson in R Markdown, integrate interactive components including interactive R consoles, record the lesson as a screencast, sync the screencast with the slides, and publish it. This seems to me to be one possible future for presenting massive online courses. An example presentation is available here, and the project is hosted here.

Open presentation

The presentation and all the relevant code and demos are hosted on GitHub, and the presentation can be seen (developed using slidify, naturally) here.

Stay tuned for an interview I did with Dr. Vaidyanathan earlier, which will be published here shortly.

Have fun using these fantastic tools in the R ecosystem to make really cool, informative presentations of your data projects. See you next time!!!

Data-driven presentations using Slidify

Presentations are the stock-in-trade for consultants, managers, teachers, public speakers, and, probably, you. We all have to present our work at some level, to someone we report to or to our peers, or to introduce newcomers to our work. Of course, presentations are passé, so why blog about them? There’s already PowerPoint, and maybe Keynote. What more need we talk about?

Well, technology has changed, and vibrant dynamic presentations are here today for everyone to see. No, I mean literally everybody, if I like. All anyone will need is a web browser to see it. Graphs can be interactive, flow can be nonlinear, and presentations can be fun and memorable again!

But PowerPoint is so easy! You click, paste, type, add a bit of glitz, and you’re done, right? Well, as most of us can attest to, not really. It takes a bit more effort and putzing around to really get things in reasonable shape, let alone great shape.

And there are powerful alternatives. Which are simple and easy. And do a pretty great job on their own. Oh, and, by the way, if you have data and analysis results to present, super slick and a one-stop-shop from analysis to presentation. Really!! Actually there are a few out there, but I’m going to talk about just one. My favorite. Slidify.

Slidify is a fantastic R package that takes a document written in RMarkdown, which is Markdown (an easy text markup format) possibly interspersed with chunks of R code that produce tables, figures, or interactive graphics, weaves in the results of that code, and then formats it into beautiful web presentations using HTML5. You can decide on the format template (it comes with quite a few) or brew your own. You can make your presentation look and behave the way you want, even like a Prezi (using ImpressJS). You can also make interactive questionnaires and even put in windows to code interactively within your presentation!!

A Slidify Demonstration

Slidify is obviously feature-rich, and infinitely customizable, but that’s not really what attracted me to it. It was the ability to write presentations in Markdown, which is super easy and lets me put down content quickly without worrying about appearance (between you and me, I’m writing this post in Markdown, on a Nexus 7). It lets me weave in results of my analyses easily, keeping the code in one place within my document. So when my data changes, I can create an updated presentation literally with the press of a button. Markdown is geared to create HTML documents. Pandoc lets you create HTML presentations from Markdown, but not living, data-driven presentations like Slidify. I get to put my presentations up on GitHub or on RPubs, or even in my Dropbox, directly using Slidify, share the link, and I’m good to go.

Dr. Ramnath Vaidyanathan created Slidify to help him teach more effectively at McGill University, where he is on the Desautels Faculty of Management. But, for me, it is now the go-to place for creating presentations, even if I don’t need to incorporate data. If you’re an analyst and live in the R ecosystem, I highly recommend Slidify. If you don’t and use other tools, Slidify is a great reason to come and see what R can do for you, even if it is just to create great presentations. There are plenty of great examples of what’s possible at http://ramnathv.github.io/slidifyExamples.

If you are in the DC metro area, come see Slidify in action. Dr. Vaidyanathan presents at a joint Statistical Programming DC / Data Visualization DC meetup on both Slidify and his other brainchildren, rCharts (which can create really cool and dynamic visualizations from R, see Sean's blog) and rNotebook on August 19. See the announcements at SPDC and DVDC, sign up, and we’ll see you there.