Reproducible research

A new paper in Annals of Applied Statistics by Keith Baggerly and Kevin Coombes discusses forensic bioinformatics. I heard Keith talk about this at MCMSki3. Basically, Baggerly and Coombes looked at a few bioinformatics papers, tried to reproduce some of the statistical analyses, and failed. In failing, they realized the authors of those papers had not done what they said they had done. So they reverse engineered what was actually done, with astounding results, e.g. training data with sensitive/resistant labels reversed and bizarre probability laws such as P(A,B) = max(P(A),P(B)), which cannot hold in general because a joint probability never exceeds min(P(A),P(B)).

The authors suggest (and this blog post suggests) that reviewers should be able to rerun all analyses immediately to verify the results. One way of doing this is to use Sweave, a combined R/LaTeX document format that 1) runs all the R analyses and then 2) compiles the LaTeX document containing all the R output. I've used Sweave to create a 2-day short course on R, which I found very valuable, yet I cannot seem to incorporate it into standard practice when writing papers. We'll see if that changes.
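For anyone who hasn't seen one, here is a minimal sketch of what a Sweave file can look like. The file name report.Rnw, the chunk label, and the simulated data are all made up for illustration; nothing here comes from the Baggerly and Coombes paper.

```
% report.Rnw (hypothetical file name)
\documentclass{article}
\begin{document}

<<analysis, echo=TRUE>>=
set.seed(1)          # fix the seed so the numbers are the same on every run
x <- rnorm(100)      # simulated data standing in for a real data set
x_mean <- mean(x)    # the "result" reported in the text below
@

The sample mean is \Sexpr{round(x_mean, 2)}.

\end{document}
```

Running `R CMD Sweave report.Rnw` executes the R chunk and writes report.tex, and `pdflatex report.tex` then builds the PDF. Because the number in the sentence is computed by `\Sexpr{}` rather than typed by hand, a reviewer who reruns the file gets the reported value straight from the code.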


Published

19 September 2011
