“Why IT Fumbles Analytics Projects”

Someone pointed me to this Harvard Business Review article by Donald Marchand and Joe Peppard, “Why IT Fumbles Analytics,” which begins as follows:

In their quest to extract insights from the massive amounts of data now available from internal and external sources, many companies are spending heavily on IT tools and hiring data scientists. Yet most are struggling to achieve a worthwhile return. That’s because they treat their big data and analytics projects the same way they treat all IT projects, not realizing that the two are completely different animals.

Interesting! I was expecting something pretty generic, but this seems to be leading in an unusual direction. Marchand and Peppard continue:

The conventional approach to an IT project, such as the installation of an ERP or a CRM system, focuses on building and deploying the technology on time, to plan, and within budget. . . . Despite the horror stories we’ve all heard, this approach works fine if the goal is to improve business processes and if companies manage the resulting organizational change effectively.

But we have seen time and again that even when such projects improve efficiency, lower costs, and increase productivity, executives are still dissatisfied. The reason: Once the system goes live, no one pays any attention to figuring out how to use the information it generates to make better decisions or gain deeper—and perhaps unanticipated—insights into key aspects of the business. . . .

Our research, which has involved studying more than 50 international organizations in a variety of industries, has identified an alternative approach to big data and analytics projects . . . rather than viewing information as a resource that resides in databases—which works well for designing and implementing conventional IT systems—it sees information as something that people themselves make valuable.

OK, I don’t know anything about their research, but I like some of their themes:

It’s crucial to understand how people create and use information. This means that project teams need members well versed in the cognitive and behavioral sciences, not just in engineering, computer science, and math.

I’m a bit miffed that they didn’t mention statistics at all here (“math”? Really??), but I’m with them on their larger point that communication is central to any serious data project. We have to move away from the idea that we do the hard stuff and then communication is just public relations. No! Communication should be “baked in” to the project, as Bob C. would say.

One more thing

One thing that Marchand and Peppard didn’t mention, but which is closely related to their themes, is that people make big claims about the effects of analytics, yet ironically these claims are themselves just made up; they’re not data-based. We saw this a couple years ago with a claim that “one or two patients died per week in a certain smallish town because of the lack of information flow between the hospital’s emergency room and the nearby mental health clinic.” Upon a careful look, these numbers (saving 75 people a year in a “smallish town”!) fell apart, and the person who promoted the claim has never shown up to defend it.

Hype can occur in any field, but I get particularly annoyed when someone hypes the benefits of data technology without reference to any data (or even, in this case, the name of the “smallish town”). Business books (you know, the ones you see at the airport) seem to be just full of this sort of story.

29 thoughts on ““Why IT Fumbles Analytics Projects””

  1. Giving the authors the benefit of the doubt: they’re affiliated with international schools and give a number of British examples, so maybe ‘math’ is supposed to be ‘maths’? Or something more generic that includes statistics.

      • It’s murky. For instance, my degree from the Statistics department is considered a maths degree, and I was often referred to as a maths student. So I think Alex has a point.

        Recently in Ottawa, the local media reported on a local stats-in-sports seminar as an advanced mathematics seminar, with no mention of statistics at all …

  2. I totally agree with you regarding the hype machine. Unfortunately, the funding structure at most companies requires that level of hype to get a project funded. Even small demonstration projects (to show that a larger project is warranted) are very tough to fund, as a large amount of work is needed to implement even a “small data” project (finding and cleaning the data, etc.).

    And when such an analytics project does get funded, rarely (in my experience) does anyone retroactively examine whether the hype matches the results. I am speaking of large industrial firms, not internet analytics projects, BTW.

  3. Interesting: this came up in some IBM analytics (sales-pitching) talks, in their advice not to focus on building and deploying the technology but rather to find a current problem someone in the organisation is grappling with that might be informed by data, work that problem through as a case study (acquiring just the needed expertise and resources), and then build up from there to other data science projects more generally.

    I thought many aspects of the talks were quite good, but one needs to keep in mind they were trying to sell data science consulting.

    Analytics in Government: Results Based Outcomes with IBM Predictive Analysis for Cost Avoidance and Beyond
    http://www.slideshare.net/dawnrk/ibm-ofa-ottawaanalyticsingov-campbellrobertson-53076774

    Government Agencies and Next Generation Analytics: Optimizing Mission and Business Outcomes
    http://www.slideshare.net/dawnrk/ibm-ofa-ottawagovagenciesandnextgenerationanalyticstimpaydospdf-53076968

    • Keith:
      Interesting links – they seem to me to be at odds with the article Andrew is drawing attention to. The IBM slides look very much like a traditional IT “solution.” While they have plenty of words about “context,” “decisions,” etc., there is little mention of the need for expertise outside the analytics/IT domains, nor do they seem to capture the article’s spirit of focusing on how people use information rather than on data acquisition or analysis.

        • This came out very clearly in the presentations and in face-to-face discussions, though you may be right that it is not clear in the slides themselves.

  4. I don’t think this is even limited to ‘IT’. I work in a UK public sector organisation; we have many statutory requirements to process data, produce statistics, and return many of them to other government organisations. But that ends up being all we do: lots of number crunching, lots of stats production, but little actual analysis. Few people seem to have the time to look at those numbers and try to understand what they mean.

    As a former boss described it, we do a lot of data and information, very little intelligence and insight.

    Analysis is often assumed to be a fixed thing that you can specify well in advance of even looking at the data. We run surveys and consultations, and managers assume that analysis of the results takes a day or two at most and only really involves counting the number of responses for each question/answer…

  5. The title seems a bit disingenuous to me. If the IT team is expected to execute these projects the same way they execute other projects — assess and build the right infrastructure based on a timetable — then it seems like IT is not fumbling anything.

    In my experience with this sort of thing, IT is usually perfectly able to give the researchers / business intelligence staff / domain scientists what they are asking for. The bigger problem is that those folks tend to want cookbook-style statistics that they can control from infantile dashboards and GUIs.

    I speculate that the actual goal with most big data systems is to give managers / business intelligence staff more metrics, basically more surface area, from which they can data mine whatever kind of political story about “insights” or “KPIs” or other nonsense that they want.

    Installing the latest and greatest streaming analytics tool, cloud-based infrastructure, and so forth, is not about driving the business to greater statistical clarity. It’s about allowing executives to intentionally create Dutch-book-like political stakes on all the different possible outcomes of a business intelligence project. (A toy version of the betting arithmetic behind that metaphor appears after this comment.)

    It’s pretty genius, in fact, that they are so good at this political maneuvering that they can get the title of an article like this to be “… IT Fumbles …”
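    To make the Dutch-book metaphor above concrete: a “Dutch book” is a set of bets that wins no matter which outcome occurs, which becomes possible whenever someone’s implied probabilities over mutually exclusive, exhaustive outcomes don’t sum to 1. Here is a minimal sketch; the outcome names and numbers are invented purely for illustration:

    ```python
    # Toy Dutch-book arithmetic: if stated probabilities over mutually
    # exclusive, exhaustive outcomes sum to more than 1, selling a
    # $1-payout ticket on each outcome at its stated price locks in a
    # profit no matter which outcome occurs.

    # Hypothetical stated probabilities for a BI project's outcomes:
    stated = {"big win": 0.5, "modest win": 0.4, "failure": 0.3}  # sums to 1.2

    collected = sum(stated.values())  # premiums collected up front

    # Exactly one outcome occurs, so exactly one ticket pays out $1.
    for outcome in stated:
        print(f"if '{outcome}' occurs, guaranteed profit = {collected - 1.0:.2f}")
    ```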

    • I think there are multiple problems. Expectations for analytics are often unrealistically high, and vendors are partly responsible: they oversell their products.

      Another problem is that companies over-invest in the tools when they ought to be investing in the people. Partly this is driven by the vendor-created illusion that their tools are so easy to use. Having a hundred nifty-looking graphs created in a jiffy gives a false sense of power; converting the cool data into profitability is not trivial.

      People keep trying to package analytics into automated, clean, WYSIWYG workflows, but somehow those never really work. The production databases never expose all their critical innards to the shiny analytics tools.

      Anecdotally, the best results come from people who combine domain knowledge, IT skills, and statistics. Unfortunately, such people are rare. What’s worse, the skills that actually work rarely get you hired, most often because the person hiring for the position has never really gotten his hands dirty at analytics.

      • Your last paragraph describes the pain of job searching for me exactly. All my training is in machine learning and statistics, but I think at this point I am equally skilled in very low-level computing topics, and even some more advanced computing topics like functional programming — and I’ve been around the block with many different databases, streaming analytics systems, Linux administration issues, etc. etc.

        Most people want to put me either in the “data science, but doesn’t really know how to program” bucket, or in the “great programmer who only knows a little about data science” bucket. When I try to explain that, no, a person can be extremely good at both software development and statistics, it’s like I strike some kind of cognitive-dissonance nerve: they break out into a cold sweat, and I have to shuffle along to keep searching.

        It seems like there is so much written about how hard it is to find people who can simultaneously design good software systems and also do good stats — but I can’t find anyone who is actually hiring for that combination of skills (and I’ve really scraped the bottom of the barrel in my job search).

        • Ely – where are you located? My experience is that the large internet companies (Yahoo/Facebook/Google/LinkedIn) are looking for exactly the type of person who can both do the statistics and write production code.

        • I am in the eastern U.S. and willing to relocate pretty much anywhere for a job I felt was a good fit. I can tell you with a lot of certainty that the big tech folks are not looking for this type of worker. They tend to want people from one or the other of the specialist buckets I described above. The roles are typically presented either as (a) “pure” research, meaning you are expected to be a PhD-type researcher who can get by coding but who is primarily there to invent new research, or (b) “sitting in between” or “on the boundary of” research and production, where you’re expected to be a production-quality programmer with an interest and maybe some slight experience in the applied domain or the statistics side of things.

          If you want to do both things (invent the new research and also implement it, from the start, in production-quality code), it just doesn’t fit into their teams. Or at least it never succeeds in getting past HR.

          Just my experience. YMMV.

        • I tried wrestling with this idea for a while. If nobody is doing accurate work, and if you can bring proper coding principles and statistical hygiene to bear on a problem, then there should be some kind of free lunch you can eat, right?

          I don’t think so. I think Robin Hanson explained this really well in his essay, “Who Cares About Forecast Accuracy?” [0].

          The gist of it is that the people in positions of authority, where prediction accuracy might be most valuable, also derive value from a whole bunch of other things: credentials, being entertaining, carrying an aura of authority, and so forth.

          If more accurate predictions undermine those things, then the extra value added by the more accurate predictions might not be worth enough to compensate for the loss of authority, the loss of impressive-seeming-ness, and so forth.

          We’d love it if all anyone cared about were objective bottom lines, as they say they do, but they obviously don’t.

          This is why I think big showy investments in business intelligence tech function more like business intelligence theater, an analogue to the TSA’s security theater. The B.I. tech isn’t there to actually raise accuracy, thereby undermining the social status and political value of certain people. Rather, it is there to be used as a tool to further affirm the status and political value of the people who oversee it.

          To me, rather than suggesting that one start a business in this area, it gives me a scary existential crisis: maybe the right response is simply to stop trying to work in statistics at all. It’s a field predicated on improving forecasting accuracy, but its customers tend not to want accuracy improved too much, so that they can preserve the political or status value of seeming right and having authority.

          [0] .

        • (Sorry the Hanson link didn’t come through. But it’s easy to find at the CATO website if you search for the title.)

        • +1
          It’s comforting to put everyone into boxes. It also makes those who know just stats or programming feel better about their place in the world.

  6. Part of the issue is the use of “IT” to mean Information Systems. The IT shop in any organization is responsible for far more than data collection and storage: the networks, hardware, maintaining/administering servers, etc. Unfortunately, those tend to be the most visible parts of the work that IT does (e.g., someone calls for help because Windoze is acting like it does, and then someone drops by to fix things). Little consideration seems to be paid to the cost and effort required to maintain a robust, high-quality, and highly available data system outside of tech firms, which ends up leading to data landfills rather than data systems.

    I also think that some division between IS and analytics can be beneficial and help the organization invest properly in its human capital (I’m fairly comfortable working with several different data management/storage solutions but am in no way a substitute for a good database developer/DBA). In my experience in the public sector in the US, IT shops often have a nearly obsessive need to control everything data-related and are willing to fight tooth and nail to maintain that level of control. Even when those shops lack any staff with training in data analysis/statistics, they still try to maintain a death grip on control rather than leveraging the collective strengths of IS folks, who can build and develop highly performant systems, and analytics staff, who can use those systems to provide high-quality analytics.

    I could be a bit jaded from my experiences, which have been near polar opposites, but it seems little progress will be made as long as this divisive paradigm continues to reign supreme in the IT world.

  7. Organizational psychology and organizational learning are real fields. There are real bodies of literature. There are real experts.

    Change is hard. Data does not make people or organizations more open to change, as shocking as that may sound. Luckily, there are fields that study these things.

    Of course, this moves away from the apparent ease of having a small team come up with The Right Answers to be immediately implemented at scale — one of the big draws of these endeavors.

    Sure, this approach can lead to insights and lessons, but implementing them is the same challenge that organizational change has long been.

    • That could be one reason why smaller (but smart) organizations have made the best use of analytics data.

      Increasingly, in the new economy, the massive advantages of scale as we knew them (scale measured by employee count) are gone.

        • This last point seems wrong to me. Almost all data analysis has economies of scale, so the outputs will generally be more valuable to larger organizations (smaller standard errors, more accurate predictions, etc.). The supposed theoretical advantage of using data for small businesses is a story about “leveling the playing field,” but the reality is that larger organizations are much more able to make productive use of data analysis, if for no other reason than the larger sample sizes they have. (A quick numerical illustration of the sample-size point follows.)
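          As a minimal sketch of that sample-size point (the sigma value and sample sizes below are purely illustrative assumptions), the standard error of a sample mean scales as sigma / sqrt(n), so an organization with 100 times the data gets roughly 10 times the precision:

          ```python
          # Standard error of a sample mean scales as sigma / sqrt(n):
          # each 100x increase in sample size buys ~10x more precision.
          import math

          sigma = 1.0  # assumed population standard deviation (illustrative)
          for n in (100, 10_000, 1_000_000):
              se = sigma / math.sqrt(n)
              print(f"n = {n:>9,}  standard error of the mean = {se:.4f}")
          ```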

        • @Dale

          Sorry, I wasn’t clear in saying what I meant: I totally agree with you about the data analysis economies of scale.

          What I meant was that the best advantages have gone to companies with relatively few people dealing with a lot of data. The new economy has decoupled the size/worth of a business from the employee count of the underlying organization.

          That is, organizational sprawl gets in the way of trying to leverage analytics.

  8. Data content and meaning are orthogonal to information technology. Think about web pages: the technology that serves up and displays a web page doesn’t care if the page is from Gelman or Stormfront. The implication is that IT people have been carefully trained to ignore data content in favor of data form. This causes problems when it comes to working with the actual data.

  9. As a consultant who has to deliver these “big data” projects, I have learned that the value derives from what you do with the data. Most middle and upper managers are simply incapable of acting on their data even if it were perfectly accurate. This is why I hate BI projects: they tend to turn into glorified fish tanks where executives fight over what color the buttons should be. I have to agree with Ely Spears that KPIs often reduce to “who/what can I blame” rather than anything productive. Most of the value these days is in automating/disintermediating the managers by bringing the analysis directly to the person making the decision, be it low-level staff, an executive, or the customer.

  10. Here’s my consulting work. I do a nice multivariable analysis of the organization’s data…as a scientist, which I am. I know my statistics, but I’m even better at thinking about data scientifically.

    I ask for all of their data, and then I treat it all in my analysis. From it, I derive a series of explanatory stories that describe the organization, how it operates, and how the organization’s customers use their products.

    My main value to the organization is in strategic planning. The boards of directors love the analysis. The management team is a bit bummed that I’m not able to help them capture operational efficiencies or better revenue. But they do end up writing sounder, more realistic budgets, which in turn helps them hit their budget, which in turn helps them keep their jobs.

    • If you have figured out a way to get all of the data you need to actually answer the important questions, please share it with the rest of us. As someone who has worked both internally and externally to large organizations, I find that the biggest challenge is getting all of the data from all of the systems that do not talk to each other so that it can all be included in the analysis. In my experience, organizational silos and fiefdoms create the greatest barriers.
