Our December Data Science DC Meetup was about Data Science in Political Campaigns, and we had four experts come in to give short presentations about their work and then participate in a panel discussion. Here is an overview of each of the four presentations, including slides and audio where available. For those of you who couldn't make it to the event, hopefully this gives you a good sense of what was presented, and for those who were there, hopefully this lets you relive the political data magic.
Sasha Issenberg
First to present was Sasha Issenberg, a political columnist and author of the book The Victory Lab. Sasha laid the groundwork for the evening, speaking about what kinds of data political campaigns have available to them and the major strategies they employ. He talked about how information is used to model scenarios and how individuals will vote. The quantity of data available is impressive, as there are now hundreds of variables to model against voting data.
Sasha also spoke about some of the ways the data is modeled, such as forming subgroups from individual-level data to measure which of those subgroups are responding to your tactics, using the data to gain a better understanding of what would mobilize someone to vote or move them toward voting a certain way, and dividing the electorate into smaller groups and figuring out how to move them.
Shrayes Ramesh
Next up was Shrayes Ramesh, a PhD Candidate in the Department of Economics at the University of Maryland, College Park, and a Data Scientist and Senior Technical Instructor at EMC. Shrayes gave a presentation about the correlation between political monetary contributions and changes in the political climate, the difficulties encountered in modeling these things, and some solutions that can lead to better models. He described some of the problems, such as unobserved factors caused by lack of data, that can lead to overestimating the impact contributions have on vote share.
Shrayes then presented some potential solutions, the first of which involved finding variables correlated with contributions but uncorrelated with election outcomes. Another solution involved what he called exogenous exits, or looking at exits from political office that were not the result of losing elections (e.g., resignations, deaths, or promotions). His final message to the audience was to be cautious about studies that show correlation in politics and to ask yourself whether they are really telling the whole story.
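To make the first idea concrete, here is a minimal sketch of what is usually called an instrumental-variable estimate. This is an illustration on synthetic data, not a reconstruction of the analysis from the talk: the variable names, the "candidate quality" confounder, the effect sizes, and the instrument itself are all assumptions made up for the example. The instrument is built so that it shifts contributions but influences vote share only through them. The naive regression slope picks up the unobserved factor and overstates the effect of contributions, while the instrumented estimate should land near the true value built into the simulation.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def ols_slope(x, y):
    """Naive regression slope of y on x: cov(x, y) / var(x)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


def simulate(n=20000, true_effect=0.5, seed=0):
    """Generate synthetic (instrument, contributions, vote share) data.

    Everything here is hypothetical: 'quality' stands in for an unobserved
    factor that raises both contributions and vote share.
    """
    rng = random.Random(seed)
    instrument, contributions, vote_share = [], [], []
    for _ in range(n):
        quality = rng.gauss(0, 1)  # unobserved confounder
        z = rng.gauss(0, 1)        # affects contributions only
        x = z + quality + rng.gauss(0, 1)
        y = true_effect * x + 2.0 * quality + rng.gauss(0, 1)
        instrument.append(z)
        contributions.append(x)
        vote_share.append(y)
    return instrument, contributions, vote_share


Z, X, Y = simulate()
OLS_ESTIMATE = ols_slope(X, Y)    # biased upward by the confounder
IV_ESTIMATE = iv_slope(Z, X, Y)   # should be close to true_effect

print(f"true effect:  0.50")
print(f"naive slope:  {OLS_ESTIMATE:.2f}")
print(f"IV estimate:  {IV_ESTIMATE:.2f}")
```

In real campaign finance data the hard part is finding a credible instrument, which is exactly the difficulty the talk pointed to; the arithmetic above is the easy part.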
Ken Strasma
Ken Strasma was the first of two speakers presenting on micro-targeting in political campaigns. Ken worked on the data science team employed by the Obama campaign during the 2012 election. He started out by describing how micro-targeting utilizes statistical models to predict voter behaviors at the individual level. He mentioned that there is a lot of skepticism around the effectiveness of micro-targeting and that the media tends to focus on one or two of the most interesting indicators when reporting stories about it - for example, that cat or dog owners tend to vote a certain way. However, Ken said that most of the competitive advantage comes from doing the boring things and doing them very well.
He also devoted some time to talking about which voters are worthwhile targets, specifically taking into consideration whether or not they are persuadable and what message you can deliver to get them to act favorably. Finally, Ken drew some comparisons between political campaigns and commercial marketing, focusing on the lessons the commercial world can learn from the micro-targeting methods used in politics.
Alex Lundry
Last but certainly not least was Alex Lundry, who worked on the Romney campaign's data science team for the 2012 presidential election. Alex provided examples of the types of data analysis the Romney campaign used throughout the election season. This included monitoring paid media for both campaigns, tracking what proportion of TV spots aired for each party each day, tracking the different commercials that the opposition aired over time, factor analysis to determine how dispersed an ad buy was, and using predictive modeling to try to measure how likely people were to vote and who they would vote for if they did. They also conducted real-time assessments of the opposition's convention speeches by gender and partisanship, did some experimentation with analyzing the effects on voters when candidates visit their towns, and conducted sentiment analysis of political discussions.
As far as tools, Alex said they relied heavily on the R statistical programming language and on Tableau's data discovery and visualization software for the bulk of their analyses.
Panel Discussion
After the presentations, there was a panel discussion where the presenters answered questions such as whether they would prefer to have better algorithms or more data, how they measure results when the results occur mostly on a single day, and what they predict for the 2016 election season.