Is your data warehouse architecture starting to look like a Rube Goldberg machine held together with duct tape instead of the elegant solution enabling data driven decision making that you envisioned? If your organization is anything like the ones I’ve worked in, then I suspect it might. Many businesses say they recognize that data is an asset, but when it comes to implementing solutions, the focus on providing business value is quickly lost as technical complexities pile up.
How can you recognize if your data warehouse is getting too complicated?
Does it have multiple layers that capture the same data in just a slightly different way? An organization I worked with determined that they needed 4 database layers (staging, long term staging, enterprise data warehouse, and data marts) with significant amounts of duplication. The duplication resulted from each layer not having a clear purpose, but even with more clarity on purpose, this architecture makes adding, changing and maintaining data harder at every turn.
Are you using technologies just because you have used them in the past? Or thought they would be cool to try out? An organization I worked with implemented a fantastically simple data warehouse star schema (http://en.wikipedia.org/wiki/Star_schema) with well under 500 GB of data. Unfortunately, they decided to complicate the data warehouse by adding a semantic layer to support a BI tool and an OLAP cube (which is some ways was a second semantic layer to support BI tools). There is nothing wrong with semantic layers or OLAP cubes. In fact, there are many valid reasons to use them. But, if you do not have said valid reason, they become just another piece of the data architecture that requires maintenance. Has someone asked for data that “should” be easy to get, but instead will take weeks of dedicated effort to pull together? I frequently encounter requests that sound simple, but the number of underlying systems involved and the lack of consistent data integration practices expands the scope exponentially.
Before I bring up too many bad memories of technical complexities taking over a BI/DW project, I want to get into what to do to avoid making things overcomplicated. The most important thing is to find a way that works for your organization to stay focused on business value.
If you find yourself thinking…
“The data is in staging, but we need to transform it into the operational data store, enterprise data warehouse and update 5 data marts before anyone can access the data.”
“I am going to try to because I want to learn more about it.”
“I keep having to pull together the customer data and it takes 2 weeks just to get an approved list of all customers.”
Stop, drop and roll, oh wait, you’re not technically on fire, so just stopping should do. Take some time to consider how to reset so that the focus is on providing business value. You might try using an approach such as the 5 Whys which was developed for root cause analysis by Sakichi Toyoda for Toyota . It forces reflection on a specific problem and helps you drill down into the specific cause. Why not try it out to see if you can find the root cause of complexity in a BI/DW project? It might just help you reduce or eliminate complexities when there is no good reason for the complexity in the first place.
Another suggestion is to identify areas of complexity from a technical perspective, but don’t stop there. The crucial next step is to determine how the complex technical environment impacts business users. For example, a technical team identifies two complex ETL processes for loading sales and HR data. Both include one off logic and processes that make it difficult to discern what is going on so it takes hours to troubleshoot issues that arise. In addition, the performance of both ETL processes has significantly degraded. The business users don’t really care about all that, but they have been complaining more and more about delays in getting the latest sales figures. When you connect the growing complexity to the delays in getting important data, the business users can productively contribute to a discussion on priority and business value. In this case, sales data would take clear precedence over HR data. Both can be added to a backlog, along with any other areas of complexity identified, and addressed in priority order.
Neither of these is a quick fix, but even slowly chipping away at overly complex areas will yield immediate benefits. Each simplification makes understanding, maintaining and extending the existing system easier.
Sara Handel, Co-founder of DC Business Intelligentsia, Business Intelligence Lead at Excella Consulting (www.excella.com) - I love working on projects that transform data into clear, meaningful information that can help business leaders shape their strategies and make better decisions. My systems engineering education coupled with my expertise in Business Intelligence (BI) allows me to help clients find ways to maximize their systems' data capabilities from requirements through long term maintenance. I co-founded the DC Business Intelligentsia meetup to foster a community of people interested in contributing to the better use of data.