Interview with Michigan State's Titus Brown

This post originally appeared on the Software Carpentry website.

Today's interview is with Professor Titus Brown, from Michigan State University.

Tell us a bit about your organization and its goals.

Michigan State is one of the Big Ten and hence a really big place. We do a lot of research, and a lot of that research is in biology. The Gene Expression in Disease and Development focus group is a collection of molecular biologists at MSU who are interested in gene expression and methods (experimental, genomic, and bioinformatic) for understanding it.

Tell us a bit about the software your group uses.

Most molecular biologists either use pre-packaged analysis tools, or nothing at all. Even the local bioinformaticians have generally not picked up on version control or anything more complex than Perl analysis scripts.

Tell us a bit about what software your group develops.

My lab develops software for our own research and works on reusable libraries for others to use; eventually we hope to build GUIs or Web applications for even more general use. We're interested in soup-to-nuts: basic sequence analysis all the way through to database curation and genome-scale visualization.

Who are you hoping Software Carpentry will help?

Any biology graduate student who needs to do anything original, computationally speaking. Computational students should find it particularly useful: someone who has been through the normal CS curriculum, for example, but has never learned about SQL databases, version control, Web services, testing, etc. We get a pretty wide range of backgrounds in our interdisciplinary grad students, so it is impossible to identify a single training track that will serve even a majority. Hopefully the SWC material can backstop the material we are already developing on the subject of "being effective at computation."

How do you hope the course will help them?

Like many other biology research institutions, we're finding ourselves overwhelmed with genome-scale data; all the new sequencing platforms (along with tandem mass spec, and a host of other systems) deliver stunning amounts of data. We are not well prepared to deal with the data, and the old molecular biologist standby of loading everything into Excel doesn't scale at all. So molecular biologists are starting to have to learn to program in order to do pretty much anything with this data. But while we at least have courses that teach people how to program, we have basically no computational science curriculum, and what we do have is targeted less at being effective than at being minimally capable in a given field.

I'm not a fan of big ideas. I would just like students to have the ability to improve their general scientific computation skills iteratively, without having to go through a class.

How will you tell what impact the course has had?

I will be happy the day one of my own students casually (and correctly) uses a technique that s/he could only have learned from the SWC material. I will be thrilled when somebody else's computational student name-drops SWC as the source of a technique that sped up their research. And I will be ecstatic when a previously purely experimental student tells me how great SWC has been for helping them learn how to do computational science better.

We don't have any systematic way of assessing the impact of the course, however.