The Venn Diagram Challenge, which started with this entry, has spurred exciting discussions at Junk Charts, EagerEyes.org, and Perceptual Edge. So I thought I would do my best to put them together in one piece.
The outcomes people created can be divided into two classes. The first group dealt with the problem of expressing the “3-way Venn diagram of percentage with different base frequency”. The second group went a little deeper to figure out a better way to express graphically what the paper is trying to say. Our ultimate goal is the second one; however, the first problem is itself an interesting challenge, so I will deal with them separately. (The second group will be dealt with in the Venn Diagram Challenge Summary 2, which should come shortly after this article.)
Venn diagram converted into a table:
(For background you can look at the previous posts: the original entry, Antony Unwin’s Mosaic chart, and Stack Lee’s bar chart.)
How to express 3-way Venn diagram of percentage with different base frequency better
Here are the 4 graphs that I am aware of that fall in this category:
It is always amazing to see how people make cool graphics out of the same data.
There were 4 things (percentage, base frequency, structure, and possible trend), and maybe more, in the Venn diagram that could have been expressed graphically. When we dissect the above graphs by these 4 things, the result is the following:
So the biggest difference between the graphs is the way in which the structure is expressed. Another point to note is how the different graphs addressed the issue of the base frequency. It’s hard to say which one is the best because they all have points that I like. For example, to express percentage, Antony’s Mosaic chart seems the most suitable, since the gray area next to the green area on each bar makes it clear that it is showing a proportion. To express base frequency, again I like Antony’s Mosaic chart, since it gives heavy weight to the combinations with more samples, which are the results that we should focus on more. As for expressing structure, it is a tough call between Patrick and Robert; I personally like them both, in different ways. Stack’s bar chart seems very good for comparing Autism and Autism Spectrum, which I should have put in the chart.
Figure 1. Prevalence of best-estimate diagnosis at age 9 years with frequency of diagnostic combinations at age 2 years expressed as area of circle. Vertical lines show plus or minus 2 standard error bounds based on the implicit binomial distribution with Bayesian correction (*1). The upper graph represents the case where clinician is yes and the bottom graph is for clinician no. PL-ADOS stands for Pre-Linguistic Autism Diagnostic Observation Schedule; ADI-R stands for Autism Diagnostic Interview–Revised.
Figure 2. Prevalence of best-estimate diagnosis at age 9 years with frequency of diagnostic combinations at age 2 years expressed as area of circle. Autism. Blue line represents the Pre-Linguistic Autism Diagnostic Observation Schedule (PL-ADOS); Green line represents Autism Diagnostic Interview–Revised (ADI-R); and Red line represents Clinician.
If we do the same analysis we get this:
In Figure 1 you can see the trend easily, at the cost of losing the overall structure. Figure 2, on the other hand, keeps the structure, but at the cost of visual complexity. Area of circle is not my favorite way to express the base frequency, but it does a good job of showing which points are more important without interfering with the trend line. This figure also generalizes to more complex Venn diagrams.
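One practical note on the circles: if base frequency is to be read as area, the radius has to grow with the square root of the count, otherwise the large groups look far bigger than they are. Here is a minimal Python sketch of that scaling. The counts are made up for illustration and `radius_for_count` is a hypothetical helper name, not the code used for the figures above.

```python
import math

def radius_for_count(n, max_count, max_radius=20.0):
    """Return a radius such that circle AREA is proportional to n.

    The largest group (n == max_count) gets radius max_radius;
    every other group is scaled by sqrt(n / max_count).
    """
    if n < 0 or max_count <= 0:
        raise ValueError("counts must be non-negative and max_count positive")
    return max_radius * math.sqrt(n / max_count)

# Hypothetical base frequencies for three diagnostic combinations
# (illustrative numbers only, not the data from the paper).
counts = {"A": 4, "B": 16, "C": 64}
largest = max(counts.values())
radii = {name: radius_for_count(n, largest) for name, n in counts.items()}

for name, r in radii.items():
    print(f"{name}: n={counts[name]:3d}  radius={r:.1f}")
```

With this scaling a group with 4 times the samples gets twice the radius, so its area is 4 times as large.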
What do you think? We appreciate your constructive comments!
(If you have charts that were not mentioned in this article and would like to be acknowledged, leave us a comment. Also, to those who tackled the issue of sensitivity and specificity: I didn’t forget you. You will be mentioned in Venn Diagram Challenge Summary 2. …to be continued…)
(*1) Calculation of standard error with Bayesian correction is done as:
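For readers who want to experiment with this kind of error bar, here is a minimal Python sketch of one common correction. It is an assumption on my part and not necessarily the exact formula used for Figure 1: add 2 successes and 2 failures to each cell before computing the binomial standard error, so that cells with very few cases (or with 0% or 100%) still get a sensible, nonzero interval. The function names and the `pseudo` parameter are hypothetical.

```python
import math

def corrected_estimate(successes, n, pseudo=2):
    """Adjusted proportion and standard error for a binomial count.

    ASSUMPTION: the correction adds `pseudo` successes and `pseudo`
    failures (an "add 2 / add 4" style adjustment). The formula behind
    the original figure may differ.
    """
    if n < 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n")
    n_adj = n + 2 * pseudo
    p_adj = (successes + pseudo) / n_adj
    se = math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return p_adj, se

def two_se_bounds(successes, n, k=2.0):
    """Bounds of +/- k standard errors, clipped to the [0, 1] range."""
    p_adj, se = corrected_estimate(successes, n)
    return max(0.0, p_adj - k * se), min(1.0, p_adj + k * se)

# Hypothetical cell: 8 of 16 children with the diagnosis at age 9.
p, se = corrected_estimate(8, 16)
low, high = two_se_bounds(8, 16)
print(f"p={p:.3f} se={se:.3f} bounds=({low:.3f}, {high:.3f})")
```

The point of the correction is visible in the small cells: with 3 out of 3, the uncorrected standard error would be zero, while the adjusted version still shows uncertainty.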