Quality Is Free - Getting There Isn't

This post originally appeared on the Software Carpentry website.

Worried about the rising tide of retractions, Nature Biotechnology recently announced that, "Its peer reviewers will now be asked to assess the availability of documentation and algorithms used in computational analyses, not just the description of the work. The journal is also exploring whether peer reviewers can test complex code..." That's a welcome step in theory, but I worry about how it will play out in practice. Scientists already complain about how much time they spend reviewing papers: reviewing code as well will take even more time, particularly if:

they haven't been taught how to do reviews, and
the code hasn't been written with review in mind.

As we found in the two-part study of code review by and for scientists that we did in collaboration with PLOS and the Mozilla Science Lab, most groups and software currently fail both of these tests. Scientists don't know what to look for in code reviews because they've never taken part in one, and because they don't know what other people will be looking for, they don't make things easy to find and understand in their own work. I therefore believe that asking the average scientist to review code will cost more in time than it delivers in value.

But this can be fixed. Teaching people how to write readable code is a first step, and that's always been part of Software Carpentry's mission. Showing them what code reviews look like (e.g., by doing one live in front of the class) hasn't been part of our workshops, but it could be. And if major journals are going to start asking for (or, one day, requiring) code reviews as part of the publication process, then teaching scientists how to do them has to move up our list of priorities.

Back in the 1970s, American business gurus started saying, "Quality is free." What they meant was that it pays for itself: the savings from mistakes you never have to fix more than cover the cost of preventing them. Study after study of software developers has shown that the same is true for code review (see Jason Cohen's chapter in Making Software for a summary). However, where American CEOs stumbled, and where I believe journals like Nature Biotechnology will stumble, is that getting from here to quality without training most definitely isn't free. The question now is whether people will realize that and invest in training, or whether the frustration caused by its lack will lead them to believe that things like code review are more trouble than they're worth.