Elements of an Analytics "Education"

This a guest post by Wen Phan, who will be completing a Master of Science in Business at George Washington University (GWU) School of Business.  Wen is the recipient of the GWU Business Analytics Award for Excellence and Chair of the Business Analytics Symposium, a full-day symposium on business analytics on Friday, May 30th -- all are invited to attend. Follow Wen on Twitter @wenphan.

GWU Business Analytics Symposium, 5/30/14, Marvin CenterWe have read the infamous McKinsey report. There is the estimated 140,000- to 190,000-person shortage of deep analytic talent by 2018, and an even bigger need - 1.5 million professionals - for those who can manage and consume analytical content. Justin Timberlake brought sexy back in 2006, but it’ll be the data scientist that will bring sexy to the 21st century. While data scientists are arguably the poster child of this most recent data hype, savvy data professionals are really required across many levels and functions of an organization. Consequently, a number of new and specialized advanced degree programs in data and analytics have emerged over the past several years – many of which are not housed in the traditional analytical departments, such as statistics, computer science or math. These programs are becoming increasingly competitive and graduates of these programs are skilled and in demand. For many just completing their undergraduate degrees or with just a few years of experience, these data degrees have become a viable option in developing skills and connections for a burgeoning industry. For others with several years of experience in adjacent fields, such as myself, such educational opportunities provide a way to help with career transitions and advancement.

I came back to school after having worked for a little over a decade. My undergraduate degree is in electrical engineering and at one point in my career, I worked on some of the most advanced microchips in the world. But I also have experience in operations, software engineering, product management, and marketing. Through it all, I have learned about the art and science of designing and delivering technology and products from ground zero - both from technical and business perspectives. My decision to leave a comfortable, well-paid job to return to school was made in order to leverage my technical and business experience in new ways and gain new skills and experiences to increase my ability to make an impact in organizations.

There are many opinions regarding what is important in an analytics education and just as many options to pursuing them, each with their own merits. Given that, I do believe there are a few competencies that should be developed no matter what educational path one takes, whether it is graduate school, MOOCs, or self-learning. What I offer here are some personal thoughts on these considerations based on my own background, previous professional experiences, and recent educational endeavor with analytics and, more broadly, using technology and problem solving to advance organizational goals.

Not just stats.

For many, analytics is about statistics and a data degree is just slightly different from a statistics one. There is no doubt that statistics plays a major role in analytics, but it is still just one of the technical skills. If you are a serious direct handler of data of any kind, it will be obvious that programming chops are almost a must. For more customized and sophisticated processing, even substantial computer science knowledge – data structures, algorithms, and design patterns – will be required. Of course, even this idea has been pretty mainstream and is nicely captured by Drew Conway’s Data Science Venn Diagram. Other areas not as obvious to data competency are that of data storage theory and implementation (e.g. relational databases and data warehouses), operations research, and decision analysis. The computer science and statistics portions really focus on the sexy predictive modeling aspects of data. That said, knowing how to effectively collect and store data upstream is tremendously valuable. After all, it is often the case that data extends beyond just one analysis or model. Data begets more data (e.g. data gravity). Many of the underlying statistical methods, such as maximum likelihood estimation (MLE), neural networks and support vector machines, all rely on principles and techniques of operations research. Further, operations research, also called optimization, offers a prescriptive perspective on analytics. Last, it is obvious that analytics can help identify trends, understand customers, and forecast the future. However, in and of themselves those activities do not add any value; it is the decisions and resulting actions taken on those activities that deliver value. But, often, these decisions must be made in the face of substantial uncertainty and risk - hence the importance of critical decision analysis. The level of expertise required in various technical domains must align with your professional goals, but a basic knowledge of the above should allow you adequate fluency across analytics activities.

Applied.

I consider analytics an applied degree similar to how engineering is an applied degree. Engineering applies math and science to solve problems. Analytics is similar this way. One importance of applied fields is that they are where the rubber of theory needs to meet the road of reality. Data is not always normally distributed. In fact data is not always valid or even consistent. Formal education offers rigor in developing strong foundational knowledge and skills. However, just as important are the skills to deal with reality. It is no myth that 80% of analytics is just about pre-processing the data; I call it dealing with reality. It is important to understand the theory behind the models, and frankly, it’s pretty fun to indulge in the intricacies of machine learning and convex optimization. In the end though, those things have been made relatively straightforward to implement with computers. What hasn’t (yet) been nicely encapsulated in computer software is the judgment and skill required to handle the ugliness of real-world data. You know what else is reality? Teammates, communication, and project management constraints. All this is to say that so much of an analytics education includes other areas that are not the theory, and I would argue that the success of many analytics endeavors are limited not by the theoretical knowledge, but rather by the practicalities of implementation whether with data, machines, or people. My personal recommendation to aspiring or budding data geeks is to cut your teeth as much as possible in dealing with reality. Do projects. As many of them as possible. With real data. And real stakeholders. And, for those of you manager types, give it a try; it’ll give you the empathy and perspective to effectively work with the hardcore data scientists and manage the analytics process.

Working with complexity and ambiguity.

The funny thing about data is that you have problems both when you have too little and too much of it. With too little data, you are often making inferences and assessing the confidence of those inferences. With too much data, you are trying not to get confused. In the best case scenarios, your objectives in mining the data are straightforward and crystal clear. However, that is often not the case and exploration is required. Navigating this process of exploration and value discovery can be complex and ambiguous. There are the questions of “where do I start?” and “how far do I go?” This really speaks to the art of working with data. You pick up best practices along the way and develop some of your own. Initial exploration tactics may be as simple as profiling all attributes and computing correlations among a few of thing, seeing if anything looks promising or sticks. This process is further exacerbated with “big data”, where computational time is non-negligible and limits feedback delays during any kind of exploratory data analysis.

You can search the web for all kinds of advice on skills to develop for a data career. The few tidbits I include above are just my perspectives on some of the higher order bits in developing solid data skills. Advanced degree programs offer compelling environments to build these skills and gain exposure in an efficient way, including a professional network, resources, and opportunities. However, it is not the only way. As with all professional endeavors, one needs to assess his or her goals, background, and situation to ultimately determine the educational path that makes sense.

References:

[1] James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, Angela Hung Byers. “Big Data: The Next Frontier for Innovation, Competition, and Productivity.” McKinsey Global Institute. June 2011.

[2] Thomas H. Davenport, D.J. Patil . “Data Scientist: The Sexiest Job of the 21st Century.” Harvard Business Review. October 2012.

[3] Quentin Hardy. "What Are the Odds That Stats Would Be This Popular?" The New York Times. January 26, 2012. 

[4] Patrick Thibodeau. “Career alert: A Master of analytics degree is the ticket – if you can get into class”. Computerworld. April 24, 2014. 

[5] Drew Conway. “The Data Science Venn Diagram.” 

[6] Kristin P. Bennett, Emilio Parrado-Hernandez. “The Interplay of Optimization and Machine Learning Research.” Journal of Machine Learning Research 7. 2006. 

[7] Mousumi Ghosh. “7 Key Skills of Effective Data Scientists.” Data Science Central. March 14, 2014.

[8] Anmol Rajpurohit. “Is Data Scientist the right career path for you? Candid advice.” KDnuggets. March 27, 2014.