False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant
[I]t is unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis. The culprit is a construct we refer to as researcher degrees of freedom. In the course of collecting and analyzing data, researchers have many decisions to make: Should [...]
“Most Popular Infographics you can find around the web” by designer and illustrator Alberto Antoniazzi.
Speaking of open data and Google tools, see this post from Revolution R: How to use a Google Spreadsheet as data in R.
Tools worth knowing about: Google Refine is a power tool for working with messy data, cleaning it up, transforming it from one format into another, extending it with web services, and linking it to databases like Freebase. A recent discussion on the Polmeth list about the ANES Cumulative File is a setting where I think [...]
In light of the recent article about drug-target research and replication (Andrew blogged it here) and l’affaire Potti, I have mentioned the “Forensic Bioinformatics” paper (Baggerly & Coombes 2009) to several colleagues in passing this week. I have concluded that it has not gotten the attention it deserves, though it has been discussed on this [...]
As a matter of convention, we usually run 3 or 4 chains in JAGS. By default, this gives rise to chains that draw samples from 3 or 4 distinct pseudorandom number generators. I didn’t go and check whether it does things 111,222,333 or 123,123,123, but in any event the “parallel chains” in JAGS are samples [...]
The new R environment RStudio looks really great, especially for users new to R. In teaching, these are often people new to programming anything, much less statistical models. The R GUIs were different on each platform, with (sometimes modal) windows appearing and disappearing and no unified design. RStudio fixes that and has already found a [...]