“Gross misuse of statistics” can be a good thing, if it indicates the acceptance of the importance of statistical reasoning

Rick Lightburn writes:

I [Lightburn] am also a member of the group Business Analytics on LinkedIn. I am struck by what I perceive as the gross misuse of statistics by the members of this group, including things that (I thought) were taught in Introductory Statistics courses in business schools. I want to suggest to you that you look at the discussions there if you want examples of such abuse.

The discussions there support me in my belief that Analytics is data manipulation in the support of previously developed conclusions.

My reply: I don’t think it’s such a bad thing. I like when people make statistical arguments, even bad statistical arguments. Once you accept the concept of arguing from logic and data, maybe you’ll be open to learning something new.

10 thoughts on ““Gross misuse of statistics” can be a good thing, if it indicates the acceptance of the importance of statistical reasoning”

  1. You seem to discount those who engage in serial lying to impress the unengaged. Indeed, they know people care about hard data and hence work hard to fabricate data. And they are the loudest. Otherwise we wouldn’t have climate denialists, Laffer curvists, and disevolutionists around.

  2. If I may play the Devil’s Advocate: An improper statistical argument has an additional air of authority over an improper anecdotal argument. It’s like an Appeal to Authority except less obvious, less open to refutation by the average well-reasoned person (who is not statistically savvy), and more opaque without appearing to be. Sort of a “knows just enough to be dangerous” kind of situation.

    If someone makes up data, any well-reasoned person can say, “Where did you get that data?” and look it up themselves. But suppose the answer is something like, “I took data from ___ on the Department of Justice’s website, whitened it (a sophisticated preprocessing step that is common in statistics), and applied a two-sided Pearson’s test which shows with 95% confidence that jail sentences have nothing to do with crime levels.” How is anyone without a statistical background to parse that hash? Most would have to mumble, “Oh, well, are you sure you did the test properly?”, to which the reply is, “I used the R statistical package, which is used world-wide by statisticians, and it takes care of the details for me.” QED

    Your point is well-taken that in order to raise the level of debate, a first step must be taken. Arguing on logical, data-based, statistical grounds is a big step up from throwing personal certainties across the table at each other. Just saying that it can be like a kitchen full of chefs who upgrade their cutting instruments from spoons to light sabers.

  3. I doubt that they are demonstrating that they “accept the concept of arguing from logic and data.” It is far more likely that they have a pre-ordained conclusion and are grasping at whatever they can to support that conclusion.

    Using statistics (poorly or well) in support of an argument is NOT the same thing as building an argument from logic and available data.

  4. I agree entirely – a major difficulty is getting people to think seriously about what the message in the data is. Part of the difficulty is exactly the sort of jargon that Wayne pointed out. Terms like ‘confidence intervals’ and ‘bootstrapping’ mean very little to most people and it would be great to be able to do away with this sort of jargon and replace it with descriptions that people could readily relate to. That is obviously far easier said than done, and every subject has its jargon etc…

    I think that people trying to lie with statistics will lie irrespective of whether they can use the right software or not, and even a misguided attempt to put good data analysis techniques into use is better than nothing.

    Losing some of the mystique associated with the jargon would help in getting more people to start thinking that way.

    • I have to agree: too few people think seriously about what the data is saying, and too many rely heavily upon output from statistical software. Certainly there are conventions and rules in any area, and sometimes I feel that individuals within the field hide behind the jargon. However, if more people took the time to explain the jargon and mystique, then perhaps people would make fewer mistakes with their data.

  5. Speaking of lying with statistics…

    I suspect a few people here have read Darrell Huff’s “How To Lie With Statistics.” My copy is pretty old, from the 1960s.

    In reading the fine new book, Robert Proctor’s Golden Holocaust: Origins of the Cigarette Catastrophe and the Case for Abolition, I was saddened to find:

    p.436-437 “Enriching Statistics”

    “Darrell Huff, author of the wildly popular (and aptly named) How to Lie With Statistics, was paid to testify before Congress in the 1950s and then again in the 1960s, with the assigned task of ridiculing any notion of a cigarette-disease link. On March 22, 1965, Huff testified at hearings on cigarette labeling and advertising, accusing the recent Surgeon General’s report of myriad failures and “fallacies.” Huff peppered his attack with amusing asides and anecdotes, lampooning spurious correlations like that between the size of Dutch families and the number of storks nesting on rooftops–which proves not that storks bring babies but rather that people with large families tend to have larger houses (which therefore attract more storks.)” (more Huff efforts for tobacco)

    Sigh.

  6. I’m a little surprised by this post. Is it really your position that being irrational is a good thing?

    There is maybe an argument that misapplication of new technology is a necessary stage we have to go through before it gets used widely and properly, but the technology we are talking about here is quite mature already. Certainly it couldn’t mean that we should encourage irrationality when we have the knowledge to point out exactly where procedures are badly flawed.

    I feel that abuse of scientific method does great harm because people are not idiots – when shown a bad argument in favor of a pre-determined thesis that is maybe contrary to one’s common sense, one is inclined to reject it, regardless of one’s technical competence. This surely undermines the acceptance of rational procedure.

    Our species has a long history of irrationality. I’d be interested if anyone can point out evidence for ways this has enhanced the acceptance of rationality. To me it’s a little like saying homeopathy is good because it shows that people recognize the need to treat their symptoms. In my mind, misuse of statistics shows less appreciation of the importance of statistical reasoning than it shows a desire to baffle people with pseudoscience.

  7. For some reason, this post reminds me of a commercial I’ve been seeing a lot lately. It is an ad for a cell phone showing how it can run some Excel-like program which produces graphs (3d bar charts!). The thing that gets me every time is that they show the hands of the user using a stylus to increase the height of one of the bars, as if data were just something that you make up like a work of fiction.

    I think this speaks volumes about the difference between the professional and lay views of data. As professionals we expect that there is a clearly documented chain back to the original observations; lay people think the numbers are just part of a story, and could be made up. As Wayne points out, the problem gets worse when that chain is poorly described, and at a certain point I usually find myself simply taking the author’s word for it rather than, say, checking the sampling methodology used in the Current Population Survey (or rather, I’m relying on the reputation of the Census Bureau).

    I think that to do statistics properly, you need a certain sense of skeptical inquiry: you need to be prepared to let the data tell you something other than what you were looking for. I’m not sure how you teach that, especially to somebody whose salary depends on a particular proposition being correct.

  8. Pingback: “How to Lie with Statistics” guy worked for the tobacco industry to mock studies of the risks of smoking statistics « Statistical Modeling, Causal Inference, and Social Science
