Another Reason to Care About Provenance
This post originally appeared on the Software Carpentry website.
A vice president at ETH is resigning today because of accusations of fraud. A key lab notebook is missing, retractions have been retracted, and according to ETH President Ralph Eichler, "there is now no legal way of finding out for sure who was responsible for the falsifications."
Question: if someone accused you of falsifying results, how well and how easily could you defend yourself? How long would it take you to pull together all the notes, data, and programs you used three years ago to produce the paper being challenged? And would your career survive? Even if you "won", you would probably lose weeks or months of research time.
I think this is one of the strongest arguments in favor of using data provenance systems. Young scientists' lives are difficult enough (see for example Peter Lawrence's recent PLoS Biology article); an accusation of fraud, well-intentioned or otherwise, could effectively destroy someone's career even if it were unfounded. Using computers to create an audit trail is just good insurance...
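To make "audit trail" a little more concrete, here is a minimal sketch of what a home-grown one might look like. It is not a real provenance system, and it is not what any particular tool does: the function names (`sha256_of`, `provenance_record`, `append_record`) and the one-JSON-object-per-line log format are assumptions made up for illustration. The idea is simply to record, each time an analysis runs, a timestamp plus a checksum of the script and of every input file, so that three years later you can show which version of which file produced a result.

```python
import hashlib
import json
import sys
from datetime import datetime, timezone


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(inputs, script, note=""):
    """Build one audit-trail entry (hypothetical format, not a standard)."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "python": sys.version.split()[0],
        "script": {"path": str(script), "sha256": sha256_of(script)},
        "inputs": [{"path": str(p), "sha256": sha256_of(p)} for p in inputs],
        "note": note,
    }


def append_record(log_path, record):
    """Append the entry as one JSON line; earlier entries are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")
```

Calling `append_record("provenance.log", provenance_record(["data.csv"], "analysis.py"))` at the top of an analysis script would add one line to the log per run. A log like this only shows what was used, not that nobody edited the log afterwards, so it complements version control and a proper provenance system rather than replacing them.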