In response to this query on how to reexpress Venn-diagram data graphically, Antony sends along this picture:
The Autism data are surprisingly clearly structured. I haven’t included the basic barcharts for each variable, though they provide useful information towards understanding the data.
Since this is a categorical dataset with five variables, some variation of a mosaicplot should be a first choice for displaying the variables in combination. From the prevalence percentages I calculated how many were diagnosed and how many were not. I then drew doubledecker plots weighted by these numbers, with the diagnosed cases selected.
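As a rough sketch of that step, here is how counts can be recovered from a group size and a prevalence percentage. The function name and the numbers in the usage note are made up for illustration; they are not values from the Autism data, and rounding to whole cases is an assumption about how fractional results would be handled.

```python
def counts_from_prevalence(n, prevalence_pct):
    """Split a group of size n into (diagnosed, not_diagnosed) counts.

    prevalence_pct is the percentage diagnosed (0 to 100). Rounding to
    the nearest whole case is an assumption; Python's round() sends
    exact halves to the nearest even integer.
    """
    if not 0 <= prevalence_pct <= 100:
        raise ValueError("prevalence_pct must be between 0 and 100")
    diagnosed = round(n * prevalence_pct / 100)
    return diagnosed, n - diagnosed
```

For a hypothetical group of 200 with 25% prevalence, `counts_from_prevalence(200, 25.0)` gives 50 diagnosed and 150 not diagnosed. Those two counts are the weights a doubledecker plot would use for that bar.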
In the top figure, Groups A and B are aggregated, and the seven possible combinations of the three tests are plotted in the nested ordering of Clinician, ADI-R and PL-ADOS. The increasing prevalence with this ordering stands out (i.e., combinations that include the Clinician test have higher prevalence rates, and within those, the ones that include ADI-R are higher still). The sizes of the different groups are also emphasised.
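One way to read "nested ordering" is that the slowest-varying test is Clinician, then ADI-R, then PL-ADOS, which is the order you get from counting in binary with Clinician as the leading digit. The sketch below lists the combinations in that order. This reading of the ordering is an assumption, and the function name is made up for illustration.

```python
from itertools import product

# Test names as given in the text, outermost (slowest-varying) first.
TESTS = ("Clinician", "ADI-R", "PL-ADOS")


def nested_combinations(tests=TESTS):
    """Return every combination of tests in nested order.

    The first entry is the empty combination (no test); the remaining
    2**len(tests) - 1 entries are the non-empty combinations.
    """
    combos = []
    for flags in product((False, True), repeat=len(tests)):
        combos.append(tuple(t for t, used in zip(tests, flags) if used))
    return combos


for combo in nested_combinations():
    print(" + ".join(combo) if combo else "(none)")
```

With three tests this gives eight rows: the empty combination first, followed by the seven non-empty ones, ending with all three tests together.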
In the lower figure, Groups A and B are separated by splitting each of the seven bars in the top figure accordingly. Here it is obvious that there is very little difference in prevalence between A and B for any of the combinations of tests.
The diagrams were drawn with Heike Hofmann’s MANET software. MANET includes a line for the empty (zero) combination (far left of both plots). The diagrams could also have been drawn with Martin Theus’s MONDRIAN software, which runs on all platforms, whereas MANET runs only on the Mac, but then the labelling beneath the plots would have had to be added separately. For a publication the labelling would be further refined.
This graph is indeed pretty, and the bars do a good job of conveying that the ultimate data are counts. Still, I think I’d prefer a set of line graphs. I just find these mosaic plots hard to read. Maybe Masanao and I can try the line plots and then write a joint paper with Antony and Igor comparing the different representations.