Prime Numbers, Biologists, and Data Visualization

This post originally appeared on the Software Carpentry website.

A couple of days ago, Titus Brown posted a question from Randy Olson: are there relevant, non-abstract problems (for biologists) that you can solve with minimal programming skills? In other words, what's the biology equivalent of traditional intro CS problems like "calculate the primes"?

The answer has three parts:

  1. Calculating primes, finding the longest line in a file, etc., are lousy introductory problems because they're completely valueless to most of the intended audience. You have to already believe you really, really want to learn how to program, and be good at delayed gratification, to get through them to the good stuff. (This isn't just true in computing: see the discussion in How Learning Works of the impact of motivation on learning.)
  2. Work by Guzdial, Ericson, et al. at Georgia Tech showed pretty conclusively that a media-first approach to computing has better outcomes, i.e., if people are manipulating images (and audio and video) right from day 1, they'll retain more of what they learn, and more students will stay in the program. (But note: you probably can't use an industrial-strength image manipulation library like PIL for intro teaching, at least not directly: you need something that assumes less background knowledge, both of the domain and of programming.)
  3. Robbins, Senseman, and Pate have been piloting a "visualization first" programming course for biologists (see this paper if you can; apologies for the paywall). While it's early days, it seems that this approach should give scientists the same benefits as "media first". For example, a programming exercise could be "draw the curves in red if the end point is less than the starting point, and in blue otherwise".
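To give a feel for how small that last exercise is, here is a minimal sketch in Python. It is not taken from the Robbins, Senseman, and Pate course: the function names `curve_color` and `plot_curves`, the output file name, and the example data are all made up for illustration, and the plotting step assumes matplotlib is installed.

```python
def curve_color(values):
    """Return "red" if the curve ends below where it starts, else "blue"."""
    return "red" if values[-1] < values[0] else "blue"


def plot_curves(curves, filename="curves.png"):
    """Draw each curve in the color chosen by curve_color and save the figure."""
    # Imported here so curve_color can be tried without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend: write to a file, no window
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for values in curves:
        ax.plot(range(len(values)), values, color=curve_color(values))
    ax.set_xlabel("time step")
    ax.set_ylabel("measurement")
    fig.savefig(filename)
    plt.close(fig)


# Made-up example data: three short series of measurements.
example_curves = [
    [5.0, 4.2, 3.9, 3.1],  # ends lower than it starts
    [1.0, 1.8, 2.5, 2.4],  # ends higher than it starts
    [2.0, 3.0, 1.0, 2.0],  # ends exactly where it starts
]
colors = [curve_color(c) for c in example_curves]
```

Calling `plot_curves(example_curves)` would write the picture to `curves.png`. The whole exercise comes down to one comparison and one loop, but the learner gets a picture of their own data at the end rather than a list of primes.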

We haven't reorganized our intro material around this idea yet, partly because of inertia, but partly because of the installation headaches of getting visualization working on N platforms in a two-day workshop. I'm very keen to try it out, though, particularly if the IPython Notebook really does make simple visualization simple to do on all major platforms.
