Graphics software is not a tool that makes your graphs for you. Graphics software is a tool that allows you to make your graphs.

Posted on November 18, 2017 9:32 AM by Andrew

I had an email exchange with someone the other day. He had a paper with some graphs that I found hard to read, and he replied by telling me about the software he used to make the graphs. It was fine software, but the graphs were, nonetheless, unreadable.

Which made me realize that people are thinking about graphics software the wrong way. People are thinking that the software makes the graph for you. But that’s not quite right. The software allows you to make a graph for yourself.

Think of graphics software like a hammer. A hammer won’t drive in a nail for you. But if you have a nail and you know where to put it, you can use the hammer to drive in the nail yourself.

This is what I told my correspondent:

Writing takes thought. You can’t just plug your results into a computer program and hope to have readable, useful paragraphs.
Similarly, graphics takes thought. You can’t just plug your results into a graphics program and hope to have readable, useful graphs.

29 thoughts on “Graphics software is not a tool that makes your graphs for you. Graphics software is a tool that allows you to make your graphs.”

Sameera Daniels on November 18, 2017 9:51 AM at 9:51 am said:

no doubt

Joyce Robbins on November 18, 2017 9:51 AM at 9:51 am said:

Absolutely. You can have all the building materials and tools in the world — and know how to use them properly — but still not have a clue how to build a house.

Nathan Taylor on November 18, 2017 10:12 AM at 10:12 am said:

Not sure if you saw this Eugene Wei post on his time at Amazon Analytics, and the art of making a good graph. It’s a bit long, but I thought quite good. In a similar vein to this post.
http://www.eugenewei.com/blog/2017/11/13/remove-the-legend

Sameera Daniels on November 18, 2017 11:02 AM at 11:02 am said:

Very good.

Thanatos Savehn on November 18, 2017 11:38 AM at 11:38 am said:

Today’s courtroom advocate spends as much time on his/her graphics as Yesterday’s did on PowerPoint slides full of statistics. And for the same purpose. To advance the narrative by distilling complexity down to a salient, comprehensible, and thus apparently obvious, essence. Post-verdict surveys of jurors have repeatedly demonstrated poor recollection of testimony and disputed data but very good recollections like the following: “that graph showing that every time Company P bought more valves its profit jumped until they bought Company D’s valves; and that’s when their profits fell off the cliff”.

Given the temptation to recast findings via graphical ad hocery, maybe graphics should be pre-registered too.

- Martha (Smith) on November 18, 2017 4:21 PM at 4:21 pm said:
  
  +1
  
Anoneuoid on November 18, 2017 11:49 AM at 11:49 am said:

“You can’t just plug your results into a graphics program and hope to have readable, useful graphs.”

People do this all the time and end up with those “dynamite” charts (it’s funny that base R refuses to have a function supporting these). When questioned they will usually have no idea about the underlying data. E.g., if there was a large error bar it could be due to an outlier, bimodal distribution, etc. The data is never inspected in disaggregated form. I’ve seen this even with as few as 3 data points.

This can be put to use, though: the presence of a dynamite chart is indicative of untrustworthy data analysis practices.
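
To illustrate the point about error bars, here is a minimal Python sketch. The two groups and the helper name `bar_summary` are made up for illustration and do not come from any real analysis. It computes the two numbers a dynamite chart displays (mean and standard error) and prints them next to the raw values.

```python
import statistics


def bar_summary(values):
    """Return (mean, standard error): the only two numbers a dynamite chart shows."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem


# Hypothetical groups of three points each (made-up numbers).
groups = {
    "tight": [4.0, 5.0, 6.0],
    "with_outlier": [1.0, 1.0, 13.0],
}

for name, values in groups.items():
    mean, sem = bar_summary(values)
    # Printing the raw values next to the summary shows what the bar hides.
    print(f"{name}: mean={mean:.2f}, sem={sem:.2f}, raw={sorted(values)}")
```

Both groups have a mean of 5, but the second group's standard error is about 4 rather than about 0.58, and only the raw values show that a single point at 13 is responsible.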

Rahul on November 18, 2017 11:41 PM at 11:41 pm said:

Do we need a similar corollary for model-building?

Isn’t there a lot of talk recently about software that automates model building & selection for you?

- Martha (Smith) on November 19, 2017 12:41 AM at 12:41 am said:
  
  “Isn’t there a lot of talk recently about software that automates model building & selection for you?”
  
  I hope not! Sensible model building depends on the context. I find it hard to imagine software that would ask questions about the context and be able to respond to a wide variety of contexts. I am guessing that any such software that currently exists just builds models based on the data — which is likely to result in over-fitting.
  
  - Sameera Daniels on November 19, 2017 7:38 AM at 7:38 am said:
    
    That’s kinda scary. We all have to become more adept at examining these softwares.
    
  - Allan on November 19, 2017 11:14 AM at 11:14 am said:
    
    Is there a significant difference between software that automatically spits out a model and someone using pre-packaged models in R or Python without much understanding? If someone picks up Introduction to Statistical Learning (great book) they can be training Neural Nets by the afternoon, all in R! Access to relatively complicated modeling procedures is already widespread; I’m not sure that software that does it all up front is that much more of a concern than access to these procedures via software lots of us already endorse (R, Python, etc.).
    
    - Dale Lehman on November 19, 2017 1:02 PM at 1:02 pm said:
      
      I agree and think this is an important issue. I often see opinions expressed that suggest that it is bad for people to use things they don’t thoroughly understand (i.e., the derivation of methods, assumptions, limitations, etc.). It is certainly dangerous, but I don’t think it is bad. After all, nobody is saying that we should not use word processors unless we can program our own typesetting and editing ourselves. If someone does not know grammar, a word processor will make them more “productive” and perhaps more dangerous, but the lack of grammar knowledge cannot be blamed on the software, nor would I want to see the use of word processors restricted to only those who understand grammar.
      
      I think the use of software is a red herring issue. Lack of understanding, poor incentives, and poor behaviors are the real issues. Easy to use software compounds these fundamental problems, but I think making the software a target is misguided. I appreciate software that makes things easier to do – and I don’t like when it claims to make easier what I know cannot and should not be made easier. For data visualization, I do not like suggested display types – good displays depend too much on contextual knowledge of the questions and the data. I do like software that makes it easier for me to change aggregation methods (mean, median, sum, quantiles, etc.). So, where to draw the line?
      
      I’m tempted to paraphrase a saying which I detest: “Software does not kill displays, people do.” I think the target of people’s dismay should not be the software but should be the dismal state of quantitative reasoning and ethics of some practitioners.
    - Kyle MacDonald on November 19, 2017 4:46 PM at 4:46 pm said:
      
      “I think the use of software is a red herring issue. Lack of understanding, poor incentives, and poor behaviors are the real issues. Easy to use software compounds these fundamental problems, but I think making the software a target is misguided. I appreciate software that makes things easier to do – and I don’t like when it claims to make easier what I know cannot and should not be made easier. […] I think the target of people’s dismay should not be the software but should be the dismal state of quantitative reasoning and ethics of some practitioners.”
      
      +++
    - Allan on November 19, 2017 10:41 PM at 10:41 pm said:
      
      Absolutely agree. I’ll add another + to the extract Kyle highlighted.
  - Bill Harris on November 20, 2017 11:44 AM at 11:44 am said:
    
    https://en.wikipedia.org/wiki/Eureqa
    
    - Bill Harris on November 20, 2017 11:46 AM at 11:46 am said:
      
      Out of curiosity, I tried it once when it was free. I never figured out how to make it do anything useful.
    - Anoneuoid on November 20, 2017 12:15 PM at 12:15 pm said:
      
      I’ve used it to find a functional form to fit to some data (turned out to be sigmoidal). You could also tell by looking at the data… but this seemed to allow a reproducible methodology that yielded the same conclusion. I.e., by tweaking the weights I was able to encode the balancing of complexity/accuracy I was performing mentally.
      
      I could also see using symbolic regression for feature generation, but haven’t ever done that. I.e., feature4 = (feature1 + feature2)/feature3.
    - Rahul on November 21, 2017 4:26 AM at 4:26 am said:
      
      +1
      
      I had played a lot with it. Especially disappointed when it could never even get back an artificial generating function with no noise.
    - Andrew on November 20, 2017 12:29 PM at 12:29 pm said:
      
      Bill:
      
      I wrote about Eureqa a few yrs ago.
Georgette Asherman on November 19, 2017 3:10 PM at 3:10 pm said:

Graphics software doesn’t make graphs, but most users fall back on the default settings, such as an offset for 0 from the y axis or horizontal gridlines. These defaults become how the clients perceive their data, for good or bad.

- Martha (Smith) on November 19, 2017 4:16 PM at 4:16 pm said:
  
  +1
  
- Dale Lehman on November 19, 2017 4:25 PM at 4:25 pm said:
  
  It is human nature to do things the easiest way available to them (call it laziness if you like, economists call it rational behavior, psychologists might say our brain is wired to work that way,….). Good software designers should anticipate that and make the defaults as smart as they can – or disable defaults if that is not possible. I might even argue that the defaults are the most important part of the software (at least they are more important than is usually realized). It seems silly to try to fight human nature and keep complaining when people do what comes naturally to them. I don’t want to overstate this point – I am not excusing someone just using defaults and never trying anything else. But I do think that some software is better than others when it comes to what they make easy for people to do – and some do much worse.
  
  - Martha (Smith) on November 19, 2017 5:26 PM at 5:26 pm said:
    
    Maybe we need to start thinking about “smart software” that will help nudge users at least sometimes to make wiser choices.
    
    - Dzhaughn on November 19, 2017 8:44 PM at 8:44 pm said:
      
      -1. In the nicest possible way. :)
      
      Software is never smart. “Wiser choice” is at best “what our survey showed was the best setting for making sales” and at worst minor committee work. That’s true even for zero cash cost software, too.
      
      Okay, maybe, if I dream, some computer vision software will be able to identify for you the parts of a given graph that the average reader will notice and miss. But, still, there is nothing average about the target readers of most interesting graphics.
    - Martha (Smith) on November 19, 2017 11:39 PM at 11:39 pm said:
      
      It sounds as though you are thinking of business applications.
      
      I was thinking more of scientific applications (and in general, not just graphics software). Developing such “smart” software would require serious study of how people use the base software, to identify scientifically/statistically poor choices (e.g., using the default when it is not a good choice) and giving messages such as “Are you sure you want to use the default? Your data suggest that choices b and c might be better,” or “This choice of option assumes that these conditions are met. Be sure to check that this is the case before using this option.”
Bill Harris on November 22, 2017 11:16 AM at 11:16 am said:

Have you seen ViSta (http://www.visualstats.org/)? It offers workflow management built on top of XLISP-STAT. I used it a bit when I used to use XLISP-STAT, and I thought it had promise. It looks like you can still use it, but the last update appears to have been in 2014.

- Bill Harris on November 22, 2017 11:17 AM at 11:17 am said:
  
  Oops: my comment was supposed to be a reply to Martha’s comment.
  
Claire on November 23, 2017 6:21 PM at 6:21 pm said:

This is like when you see qualitative researchers describing their methods as “I used AtlasTI” or “I used MaxQDA” or whatever. Sure enough, but what did you do with the software? As a reader, I don’t actually care that much what tool you used, but how you arrived at your results.

- Martha (Smith) on November 23, 2017 10:37 PM at 10:37 pm said:
  
  One of my frequent gripes along these lines is “We used SAS PROC MIXED”, with no further details, such as which factors were considered fixed, which were random, and why those choices were made.
  

Statistical Modeling, Causal Inference, and Social Science
