Sarah Supp: What I've Learned

This post originally appeared on the Software Carpentry website.

I have to admit that when I first decided I was going to learn programming skills, it was for academic survival. I was decidedly not excited about taking a course. As a graduate student, I was lucky that Utah State University had a Programming for Biologists class, taught by Ethan White. Most universities still lack the infrastructure to teach these skills to scientists, outside of signing up for courses in the computer science department. Within the first week, I learned that programming is really like playing a series of logic games, and that it could actually be quite fun. Aside from practical skills, the most important thing I learned was not to be intimidated by computational problems.

Since then, I have volunteered with Software Carpentry, helping at several bootcamps held at Utah State and leading bootcamps at Washington University in St. Louis and for the Boston Women in Science and Engineering group. So far, I have really enjoyed these opportunities, and all the amazing people I have been connected to, including the rapidly growing community of scientist-programmers. And teaching these workshops challenges me to keep my skills up to date and to keep learning new things. As someone who never thought that I would be interested in programming, I especially like teaching novices, watching as programming becomes accessible instead of impossible.

I take a broad approach to my ecological research, combining field work and large datasets with a variety of statistical approaches. This absolutely means that being able to learn and apply new computational toolkits is necessary for finishing projects. In ecology, R is the dominant language, and is what I use for most of my code. But I found that code style and organizational concepts from learning Python translated directly to my R code, and have helped me to keep my code clean and readable. Developing new datasets and working with existing data is much more tractable for me now, because I know SQL and the current best practices for database management. Understanding how to code, instead of using a GUI that keeps everything hidden in a "black box", demystifies scientific analysis, and ultimately, allows me to do better science by being in complete control of the process.

The most important part of my daily workflow is keeping each of my projects under version control, which I do primarily via GitHub (https://github.com/sarahsupp). This means that if my computer gives me the black screen of death, if there is a fire in my office, or if I have the flu and, under the influence of NyQuil, decide that I'm a master programmer and completely break my code (true story) – well, none of that matters. Because I am a Time Lord (except I can only go backwards in time).

My experience, in short:

  • Having computational skills has enabled me to make my code better.
  • Part of good code is keeping it under version control. I use Git.
  • Having not-horrible code gave me confidence to share it in public GitHub repositories.
  • Having code on GitHub made it easy to publish, when the journal required my code to be available (https://github.com/weecology/portal-experimental-macroeco).
  • Having my code on GitHub recently allowed me to be part of a great discussion on my research! What good is science without communication anyway?

Yes, it is possible to succeed in science without being a computer scientist. I'm not one. But obtaining even a basic level of computational literacy can allow researchers to communicate better with collaborators, speed the rate at which science is done, and open many doors for new research projects, new collaborators, or even new career paths.

I tweet about ecology, programming, and being a postdoc @srsupp, and details on my research are updated on my website and my GitHub repositories. The Weecology group recently wrote a paper on how to share your data, which is a great resource for basic database organization and management.