Python for Biologists

This post originally appeared on the Software Carpentry website.

Programmers, at least as found in scientific research institutions, generally fall into two groups. There are people with a computer science background, for whom programming is a way of Thinking About The World, and people with a scientific background, for whom programming is a way of Getting Things Done.

As well as applying to the practice of programming, these differences also apply to its teaching. Computer science folks tend to be happiest discussing languages and features in very abstract terms, and looking at equally abstract examples (witness, for example, the ubiquity of calculating Fibonacci numbers when discussing the difference between iterative and recursive processes). Scientific folks (my day-to-day dealings are mostly with biologists) generally prefer concrete examples that are more tailored to their interests. For example, when I introduce recursion to biologists I use the example of traversing a taxonomic tree.
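To give a flavour of what such an example might look like, here is a minimal sketch of recursion over a taxonomic tree. The tree, the `taxonomy` dictionary and the `descendants` function are all made up for illustration and are not taken from any particular course; the tree is simply stored as a dictionary mapping each taxon to its child taxa.

```python
# Hypothetical toy taxonomy: each taxon maps to a list of its child taxa.
# Taxa with no entry in the dictionary are treated as leaves.
taxonomy = {
    "Primates": ["Hominidae", "Cercopithecidae"],
    "Hominidae": ["Homo", "Pan"],
    "Homo": ["Homo sapiens"],
    "Pan": ["Pan troglodytes", "Pan paniscus"],
    "Cercopithecidae": ["Macaca"],
    "Macaca": ["Macaca mulatta"],
}


def descendants(taxon, tree):
    """Return every taxon below `taxon`, in depth-first order."""
    result = []
    for child in tree.get(taxon, []):
        result.append(child)
        # Recursive step: a child's descendants are found the same way.
        result.extend(descendants(child, tree))
    return result


print(descendants("Hominidae", taxonomy))
```

The function calls itself once per child, and the recursion stops naturally at the leaves, where there are no children left to visit. That is the same idea the Fibonacci example is meant to convey, but framed as a question a biologist might actually ask.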

These two styles of teaching and learning both have their place, but the majority of tutorials, courses and books (online and offline) seem to me to be slanted more towards the abstract and general. This is entirely to be expected: after all, computer scientists are the people who know the languages most thoroughly and are therefore the natural choice to guide newcomers through them. And of course, to write material aimed at learners with a specific background is to deliberately limit one's appeal.

Nevertheless, I think that there is a need for what one might call "domain-specific" guides to programming. In my experience, writing for a specific audience pays off in a number of ways. One can pick examples that will be relevant to the audience's experience—examples which not only illustrate a particular point or language feature, but also provide the motivation for learning it. And one can choose to teach the features of the language that are most likely to be useful. Case in point: most people would consider regular expressions to be a fairly advanced feature of most languages, but they are so useful in biology that I teach them as part of all my introductory programming courses.
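As a rough illustration of why regular expressions earn their place so early, consider searching a DNA sequence for a motif that allows more than one base at a given position. The sequence and the motif below are invented for this sketch; only Python's standard `re` module is used.

```python
import re

# Made-up DNA sequence, purely for illustration.
dna = "ATCGGACCTTAGGTCCAAGGCCC"

# Motif: GG, then either A or T, then CC.
motif = re.compile(r"GG[AT]CC")

# Collect the zero-based start position of every match.
positions = [match.start() for match in motif.finditer(dna)]
print(positions)
```

Without a regular expression, the same search needs a separate test for each variant of the motif, and that quickly becomes unwieldy as the number of ambiguous positions grows.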

With these points in mind, I spent a good chunk of the summer writing a free introductory programming course for biologists. I picked Python as the language to use, mostly due to its high take-up among biological researchers, and put it online at pythonforbiologists.com. I've been extremely pleased with the reception: the site has had 17,000 unique visitors in the 3 months it's been online and I've had lots of positive feedback. One of the most exciting things has been the number of people writing with suggestions for improvements, ranging from typo corrections to requests for entirely new topics to be covered.

I think that these kinds of niche programming tutorials and courses have a bright future, and I expect them to play a growing role as programming becomes an increasingly desirable transferable skill, not just for those of us involved in science, but for anyone in a data-driven industry. There's a lot of value to be added to the already-excellent online programming resources by tailoring them to specific fields. I hope to see more of them, and if anybody reading this is interested in creating something similar, I'd love to hear from you and would be happy to give what help I can.