The failure rate for technology projects continues to be over 50%, and some estimate it’s even higher. While that statistic may or may not surprise you, what will surprise you is how much data management strategy (or lack thereof) contributes to that failure rate, directly or indirectly. People blame what they can see, and because poor data management happens behind the scenes, it is often a silent killer.
Many projects start out without any data management plan whatsoever. Once the system goes live, data management falls to either the application developers or the operations and support staff, and it grows organically without any long-term plan. The project suffers as a result.
In a recent study, the Capability Maturity Model Integration Institute (CMMI) found that data management played a part in 100% of the technology failures surveyed. It probably isn’t quite that high in reality, but poor data management does often masquerade as other problems, such as scope creep, poor technology choices, and poor planning. Even if we play devil’s advocate and assume one of these other factors is the main cause of a technology project failure, further analysis will often reveal poor data management as a significant underlying contributor to that cause. Perhaps you could have done a better job planning if you had a better understanding of the quality issues inherent in the data you needed to fuel your application. Or maybe you wouldn’t have gotten off schedule if your best developers weren’t constantly pulled away from their work to chase data content problems.
Unlike developers or service professionals, data experts ask data-centric questions up front: Where is the data sourced? How reliable are those sources? Are there multiple, redundant sources? How do you handle discrepancies when they happen? What are the service level agreements with providers? How will you handle overrides when errors occur?
Case Study #1: Override Overdrive
Take the last question, for example. An application ETLs a daily batch feed for core content without which the application cannot function. You have a 24-hour service level agreement with the provider, so you figure you’re covered. Once the project goes live, though, you discover a much higher failure rate than you anticipated. So what do you do? Typically, it starts with your service people. They field the calls, investigate, and find that the vendor has sent an incorrect value. Your service team notifies the vendor and replies to the user, saying the vendor will correct the value in the next overnight run. You then find that your users can’t live with a 24-hour turnaround on frequent data errors introduced by your vendors. Your service team knows what the right data point should be, so you have them correct it. You may have to engage development to create a mechanism to facilitate mid-day updates. That night, you get another bad value from the same vendor, which resets the service team’s update. The next day, the same client calls in with the same problem yet again. So, you have a developer make a change to “lock” an overridden value. This works for a while, right up until the vendor corrects their root problem and sends you a legitimate update, which the lock now blocks. The data point is wrong again, likely affecting the same user, and this time the cause is not the vendor but your own reactive override logic.
Garbage in, garbage out.
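One way to avoid both failure modes in this case study (the nightly feed resetting a correction, and a permanent lock blocking a legitimate fix) is to remember which vendor value an override replaced. The sketch below is illustrative only: the DataPoint class, its field names, and both functions are hypothetical, and it assumes that any change in the vendor's value should retire the override. A real system might route such changes to a reviewer instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataPoint:
    """One vendor-fed value plus an optional manual override (hypothetical model)."""
    vendor_value: str
    override_value: Optional[str] = None
    # The vendor value that was in place when the override was entered.
    overridden_vendor_value: Optional[str] = None

    @property
    def effective_value(self) -> str:
        """The value the application should show to users."""
        if self.override_value is not None:
            return self.override_value
        return self.vendor_value


def apply_override(point: DataPoint, corrected: str) -> None:
    """Record a manual correction and remember which vendor value it replaced."""
    point.override_value = corrected
    point.overridden_vendor_value = point.vendor_value


def apply_vendor_update(point: DataPoint, new_vendor_value: str) -> None:
    """Load the nightly vendor value without blindly resetting or locking."""
    point.vendor_value = new_vendor_value
    if point.override_value is None:
        return
    if new_vendor_value == point.overridden_vendor_value:
        # The vendor repeated the value already judged wrong: keep the override.
        return
    # Assumption: a changed vendor value is treated as a correction,
    # so the override is retired and the vendor value takes effect.
    point.override_value = None
    point.overridden_vendor_value = None
```

With this policy, a repeated bad value leaves the service team's correction in place, while a changed vendor value releases it, so neither the reset nor the permanent lock from the scenario above occurs.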
Case Study #2: Fuzzy Definitions
In 2006, the New York Stock Exchange merged with the Archipelago Electronic Exchange. A large, top-tier electronic trading broker had separate designations for shares traded on the NYSE. Because shares can trade on different exchanges, electronic trading operations typically have one exchange they consider the “primary” exchange.
The primary exchange is a loosely defined concept from a data management standpoint. In this case, traders considered the most liquid exchange to be the primary, whereas middle- and back-office professionals identified the primary as the exchange on which the security was originally listed. Both are reasonable interpretations of a poorly defined, but critical, data element.
It wasn’t an issue until, sometime in 2008, a junior trader felt he wasn’t getting as much liquidity as he thought he should. The conclusion he reached was that NYSE exchange orders were not being routed to ARCA because the primary exchange was set to NYSE. He thought that ARCA was more liquid and that his orders should be routed there. What he did not know was that NYSE orders were being automatically routed to ARCA. What was also not known to him, or to support, was that the order execution software could not handle NYSE orders that were directly routed to ARCA. So, when he called the support desk and instructed the operator to change the primary exchange for all New York stocks to ARCA from NYSE, they complied.
The resulting fallout shut down electronic trading for two days at this firm. Many clients, including another top-tier bank that had been a consistent $10MM/year commission client, left as a result of the outage.
There are multiple touch points where a data expert could have helped to avoid this outage. For example, a data expert could have pushed for a well-defined, agreed-upon definition of the primary exchange concept, or she could have acted as an escalation point when the support staff was asked to make a massive change to all of the stocks on the most active stock exchange in the world.
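The escalation point described above can also be built into the tooling that support staff use. The following is a minimal sketch under stated assumptions, not a description of the firm's systems: the ChangeRequest class, the BULK_CHANGE_LIMIT threshold, and the approval flag are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    """A requested reference-data change (hypothetical structure)."""
    field_name: str
    new_value: str
    affected_records: int
    approved_by_data_owner: bool = False


# Hypothetical threshold: larger changes need sign-off from a data owner.
BULK_CHANGE_LIMIT = 50


def needs_escalation(request: ChangeRequest, limit: int = BULK_CHANGE_LIMIT) -> bool:
    """Return True when a change is too broad to apply without approval."""
    return request.affected_records > limit and not request.approved_by_data_owner


def submit(request: ChangeRequest) -> str:
    """Decide whether the change is applied or routed to a data owner."""
    if needs_escalation(request):
        return "escalate"
    return "apply"
```

Under this sketch, a request to change a field on every stock of an exchange would be routed to a data owner rather than applied on the strength of a phone call, while single-record fixes would still go straight through.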
Case Study #3: Reluctant Data Sources
Another top firm made three different (and expensive) attempts to develop an in-house client relationship management system. On each attempt, the newly developed system failed to take hold and was rejected by users who cited various problems, including poor usability. On one of these exercises, the firm spent over $40MM developing a “user friendly” version of the application to address the stated concerns. It wasn’t until much later that they realized the failure in actuality had nothing to do with usability: it was in fact because the firm’s sales staff had no motivation to share their rolodexes with other agents.
The ultimate solution cost nothing. Senior management simply mandated that employees use the CRM, a far less expensive approach than the “user friendly” in-house application the firm had built and since abandoned. Data experts are trained to look at how data is sourced and to identify potential pitfalls. Such expertise could have helped this firm avoid these costly failed technology projects.
So, what is the solution? There is certainly no “one size fits all” answer, and each organization needs to look at its own needs, risk, and exposure. If you want your technology project to succeed, your data management strategy should be part of your plans from the beginning. Many larger firms have gone as far as to establish the role of a Chief Data Officer to take ultimate responsibility for data management. A decent understanding of the data management maturity model developed by the EDM Council and the CMMI is certainly a good place to start, but your solution may be as simple as establishing who is responsible for data management issues and what project requirements are necessary to support those responsibilities.
The point is, no matter the size or scope of your project, if someone asks you what your data management strategy is and you don’t really have an answer, be prepared for your project to become yet another statistic.