
DIY data analysis: three fun examples

I recently came across some links showing readers how to make their own data analysis and graphics from scratch. This is great stuff–spreading power tools to the masses and all that.

From Nathan Yau: How to Make a US County Thematic Map Using Free Tools and How to Make an Interactive Area Graph with Flare. I don’t actually think the interactive area graphs are so great (they work with the Baby Name Wizard, but to me they don’t do much in the example that Nathan shows), but that doesn’t really matter; what’s cool here is that he’s showing us all exactly how to do it. This stuff is gonna put us statistical graphics experts out of business^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^Ha great service.

And Chris Masse points me to these instructions from blogger Iowahawk on downloading and analyzing a historical climate dataset. Good stuff.

5 Comments

  1. Peter says:

    You can also find instructions for doing the county thematic map in R instead of Python, here:
    http://blog.revolution-computing.com/2009/11/chor

  2. Ben Lauderdale says:

    A while ago I created a file that maps all of the regions in the county map in library(maps) onto county FIPS codes. I matched on name and then fixed the errors by hand. There could still be errors, though: I was more interested in getting rid of the white blotches in my map than in getting every last county right: CSV
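
    The name-matching step described above could be sketched roughly as follows in Python. This is not the commenter's code: the function names, the assumed "state,county" region format, and the sample rows are illustrative assumptions. The idea is to normalize both sets of names, look each region up, and collect whatever fails to match for fixing by hand.

```python
import re


def normalize(name):
    """Lowercase, drop a trailing 'county', and strip everything but letters."""
    name = name.lower()
    name = re.sub(r"\s+county$", "", name)
    return re.sub(r"[^a-z]", "", name)


def match_regions(regions, fips_rows):
    """Match 'state,county' region names to FIPS codes by normalized name.

    regions   -- region names, assumed to look like 'alabama,st clair'
    fips_rows -- (state, county, fips) tuples; FIPS codes are kept as
                 strings so that leading zeros survive
    Returns (matched, unmatched): a dict of region -> FIPS code, and a list
    of regions that still need to be fixed by hand.
    """
    lookup = {(normalize(s), normalize(c)): f for s, c, f in fips_rows}
    matched, unmatched = {}, []
    for region in regions:
        state, _, county = region.partition(",")
        key = (normalize(state), normalize(county))
        if key in lookup:
            matched[region] = lookup[key]
        else:
            unmatched.append(region)
    return matched, unmatched


# Illustrative sample rows only, not a complete FIPS table.
fips_rows = [
    ("Alabama", "Autauga County", "01001"),
    ("Alabama", "St. Clair County", "01115"),
]
# "alabama,nowhere" is a made-up region used to show an unmatched name.
regions = ["alabama,autauga", "alabama,st clair", "alabama,nowhere"]

matched, unmatched = match_regions(regions, fips_rows)
print(matched)
print(unmatched)
```

    Anything left in the unmatched list is what produces the white blotches on the map, so that list is the natural place to start the manual cleanup.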

  3. Phil says:

    The Iowahawk write-up is quite good. It shows why the method he describes, which is what Mann et al. used to make their original "hockey stick" climate plot, is not good, although they missed another chance to pile on: they could have shown that the method is prone to making "hockey stick" shapes even from data with no underlying upward tendency. So, good writeup.

    But they really should have told people that the data have been re-analyzed in many different ways, and that even good methods without a "hockey stick" bias still show a very rapid temperature increase in recent decades. The increase is real. What is unknown is the northern hemisphere (and global) mean temperature between 1000 and 1600. The temperature proxy data from that period are really uncertain, so nobody really knows what the temperature was back then. As you can imagine, lots of people have looked at the paleoclimate data in the past decade, and there are several journal articles about it. People interested in this subject should take a look at some of the more recent ones.

  4. Willem says:

    Does anybody know of a way to do this in R that is a whole lot easier than this?

    If not, I should come forward with my code. I built this two years ago, including a nifty Google Earth output for my home country, the Netherlands. Including a US county map shouldn't be a lot of work.

  5. Willem says:

    Actually, Peter has said it all. Perhaps they should ask people to make a KML file next, and someone will redo my work in 8 lines of code instead of my masses!