Graphics software is not a tool that makes your graphs for you. Graphics software is a tool that allows you to make your graphs.

I had an email exchange with someone the other day. He had a paper with some graphs that I found hard to read, and he replied by telling me about the software he used to make the graphs. It was fine software, but the graphs were, nonetheless, unreadable.

Which made me realize that people are thinking about graphics software the wrong way. People are thinking that the software makes the graph for you. But that’s not quite right. The software allows you to make a graph for yourself.

Think of graphics software like a hammer. A hammer won’t drive in a nail for you. But if you have a nail and you know where to put it, you can use the hammer to drive in the nail yourself.

This is what I told my correspondent:

Writing takes thought. You can’t just plug your results into a computer program and hope to have readable, useful paragraphs.
Similarly, graphics takes thought. You can’t just plug your results into a graphics program and hope to have readable, useful graphs.

29 thoughts on “Graphics software is not a tool that makes your graphs for you. Graphics software is a tool that allows you to make your graphs.”

  1. Absolutely. You can have all the building materials and tools in the world — and know how to use them properly — but still not have a clue how to build a house.

  2. Today’s courtroom advocate spends as much time on his/her graphics as yesterday’s did on PowerPoint slides full of statistics, and for the same purpose: to advance the narrative by distilling complexity down to a salient, comprehensible, and thus apparently obvious, essence. Post-verdict surveys of jurors have repeatedly demonstrated poor recollection of testimony and disputed data but very good recollections like the following: “that graph showing that every time Company P bought more valves its profit jumped until they bought Company D’s valves; and that’s when their profits fell off the cliff”.

    Given the temptation to recast findings via graphical ad hocery, maybe graphics should be pre-registered too.

  3. You can’t just plug your results into a graphics program and hope to have readable, useful graphs.

    People do this all the time and end up with those “dynamite” charts (it’s funny that base R refuses to have a function supporting these). When questioned they will usually have no idea about the underlying data. E.g., if there was a large error bar it could be due to an outlier, a bimodal distribution, etc. The data is never inspected in disaggregated form. I’ve seen this even with as few as 3 data points.

    This can be put to use, though: the presence of a dynamite chart is indicative of untrustworthy data analysis practices.
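
    A minimal sketch of how this happens, assuming Python with only the standard library. The three samples and the `summarize` helper are made up for illustration and do not come from any real analysis. Two of the samples give nearly the same mean and SD, which is all a bar with an error bar would show, yet the raw values look nothing alike:

```python
from statistics import mean, stdev

# Made-up samples for illustration only; each has a mean of exactly 10.
samples = {
    "symmetric": [8, 9, 10, 11, 12],
    "outlier": [9, 9, 9, 9, 14],
    "bimodal": [8, 8, 8, 12, 12, 12],
}


def summarize(values):
    """Return (mean, sample SD): the only numbers a bar-plus-error-bar chart shows."""
    return mean(values), stdev(values)


for name, values in samples.items():
    m, sd = summarize(values)
    # Print the summary next to the sorted raw values it hides.
    print(f"{name:>9}: mean={m:.2f} sd={sd:.2f} raw={sorted(values)}")
```

    Printing (or plotting) the raw values next to the summary is the disaggregated look that the bar chart skips.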

    • “Isn’t there a lot of talk recently about software that automates model building & selection for you?”

      I hope not! Sensible model building depends on the context. I find it hard to imagine software that would ask questions about the context and be able to respond to a wide variety of contexts. I am guessing that any such software that currently exists just builds models based on the data — which is likely to result in over-fitting.

      • Is there a significant difference between software that automatically spits out a model and someone using pre-packaged models in R or Python without much understanding? If someone picks up Introduction to Statistical Learning (great book) they can be training neural nets by the afternoon, all in R! Access to relatively complicated modeling procedures is already widespread; I’m not sure that software that does it all up front is that much more of a concern than access to these procedures via software lots of us already endorse (R, Python, etc.).

        • I agree and think this is an important issue. I often see opinions expressed that suggest that it is bad for people to use things they don’t thoroughly understand (i.e., the derivation of methods, assumptions, limitations, etc.). It is certainly dangerous, but I don’t think it is bad. After all, nobody is saying that we should not use word processors unless we can program our own typesetting and editing ourselves. If someone does not know grammar, a word processor will make them more “productive” and perhaps more dangerous, but the lack of grammar knowledge cannot be blamed on the software, nor would I want to see the use of word processors restricted to only those who understand grammar.

          I think the use of software is a red herring issue. Lack of understanding, poor incentives, and poor behaviors are the real issues. Easy to use software compounds these fundamental problems, but I think making the software a target is misguided. I appreciate software that makes things easier to do – and I don’t like when it claims to make easier what I know cannot and should not be made easier. For data visualization, I do not like suggested display types – good displays depend too much on contextual knowledge of the questions and the data. I do like software that makes it easier for me to change aggregation methods (mean, median, sum, quantiles, etc.). So, where to draw the line?

          I’m tempted to paraphrase a saying which I detest: “Software does not kill displays, people do.” I think the target of people’s dismay should not be the software but should be the dismal state of quantitative reasoning and ethics of some practitioners.

        • “I think the use of software is a red herring issue. Lack of understanding, poor incentives, and poor behaviors are the real issues. Easy to use software compounds these fundamental problems, but I think making the software a target is misguided. I appreciate software that makes things easier to do – and I don’t like when it claims to make easier what I know cannot and should not be made easier. […] I think the target of people’s dismay should not be the software but should be the dismal state of quantitative reasoning and ethics of some practitioners.”

          +++

        • I’ve used it to find a functional form to fit to some data (it turned out to be sigmoidal). You could also tell by looking at the data… but this seemed to allow a reproducible methodology that yielded the same conclusion. I.e., by tweaking the weights I was able to encode the balancing of complexity/accuracy I was performing mentally.

          I could also see using symbolic regression for feature generation, but haven’t ever done that. I.e., feature4 = (feature1 + feature2)/feature3.

        • +1

          I had played a lot with it. I was especially disappointed when it could never even get back an artificial generating function with no noise.

    • It is human nature to do things the easiest way available (call it laziness if you like; economists call it rational behavior; psychologists might say our brain is wired to work that way, ...). Good software designers should anticipate that and make the defaults as smart as they can – or disable defaults if that is not possible. I might even argue that the defaults are the most important part of the software (at least they are more important than is usually realized). It seems silly to try to fight human nature and keep complaining when people do what comes naturally to them. I don’t want to overstate this point – I am not excusing someone just using defaults and never trying anything else. But I do think that some software is better than others when it comes to what it makes easy for people to do – and some does much worse.

        • -1. In the nicest possible way. :)

          Software is never smart. “Wiser choice” is at best “what our survey showed was the best setting for making sales” and at worst minor committee work. That’s true even for zero cash cost software, too.

          Okay, maybe, if I dream, some computer vision software will be able to identify for you the parts of a given graph that the average reader will notice and miss. But, still, there is nothing average about the target readers of most interesting graphics.

        • It sounds as though you are thinking of business applications.

          I was thinking more of scientific applications (and in general, not just graphics software). Developing such “smart” software would require serious study of how people use the base software, to identify scientifically/statistically poor choices (e.g., using the default when it is not a good choice) and giving messages such as “Are you sure you want to use the default? Your data suggest that choices b and c might be better,” or “This choice of option assumes that these conditions are met. Be sure to check that this is the case before using this option.”

  4. This is like when you see qualitative researchers describing their methods as “I used AtlasTI” or “I used MaxQDA” or whatever. Sure enough, but what did you do with the software? As a reader, I don’t actually care that much what tool you used, but how you arrived at your results.

    • One of my frequent gripes along these lines is “We used SAS PROC MIXED”, with no further details, such as which factors were considered fixed, which were random, and why those choices were made.
