My Favorite Tool is Git

This post originally appeared on the Data Carpentry website

I love Git and GitHub.

I only use them for work I care about. Examples include lesson development for R workshops, my recent performance review packet, and collaborative projects on species distribution modeling. Oh, and Software Carpentry workshop websites (obviously).

Why I like Git:

The version control system and the companion remote host system (GitHub or Bitbucket or CloudForge, etc.) provide a great versioning and collaboration platform, the highlights of which have been enumerated many times over and in such depth that I won’t talk about them here.

The reason I like this dynamic duo is that it reinforces best practices. I should say that using version control won’t necessarily make you 100% compliant with everyone’s idea of best practices, but with a little consideration of a workflow, it can go a long way. Here is why:

  1. Reproducibility: I ignore the output folder in pretty much all my Git repositories, which forces all figures and analyses to be completely reproducible from the materials in the tracked folders.

  2. Offsite backup: Rather than lugging my aging laptop to and fro, pull-add-commit-push allows me to preserve my work in a location accessible from any internet-enabled terminal. This has the added benefit of protecting against natural disasters and the inevitable bricked hard drive (mark my words: death, taxes, and a failed HD are the only certainties in life now).

  3. Sharing: Sure, some of what I currently work on is not ready to be released, so I use the private repository option. But when I am ready to share my code and data, it’s literally one or two mouse clicks and my work is open for re-use by the community. The visibility of platforms like GitHub and Bitbucket makes work that much more discoverable.

  4. Documentation: Everybody’s favorite part of software development is … not likely writing documentation (granted, some of you out there do enjoy it). Because good documentation is imperative for re-use and evaluation, the little reminders from GitHub (“Help people interested in this repository understand your project by adding a README”) further encourage best practices for open research. The support for Markdown rendering on GitHub makes it especially nice for writing professional-looking documentation of your work.
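
As a rough sketch of point 1, here is how an ignored output folder might be set up in a throwaway repository. The layout (`scripts/` tracked, `output/` ignored) and the file names are hypothetical, chosen only for illustration; adapt them to your own project.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so no real project is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical layout: scripts/ is tracked, output/ is regenerated on demand.
mkdir scripts output
echo 'echo "figure data" > output/figure.txt' > scripts/make_figure.sh
sh scripts/make_figure.sh

# Tell Git to ignore everything under output/.
echo 'output/' > .gitignore

# Stage everything; the ignored folder is skipped.
git add .
git status --short
```

The status listing should show `.gitignore` and the script as staged, and nothing from `output/`. Anyone who clones the repository then has to rerun the script to get the figure, which is the point.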

Sure, I struggle sometimes with Git syntax and concepts, but 98% of the time I only use four commands (pull-add-commit-push, remember?) and the Git/GitHub combo reduces the time I spend developing, preserving, and sharing the work I do.
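
For anyone who has not seen the four-command cycle, a minimal sketch follows. It assumes a local bare repository standing in for the remote host (GitHub, Bitbucket, and so on), and the file contents, commit messages, and author identity are placeholders.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# A local bare repository stands in for the remote host in this sketch.
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/laptop"
cd "$work/laptop"

# Placeholder identity, set for this throwaway repository only.
git config user.name "Example Author"
git config user.email "author@example.org"

# First commit, so the remote has a branch to pull from later.
echo "# Project notes" > README.md
git add README.md
git commit -q -m "Add README"
git push -q -u origin HEAD

# The everyday cycle: pull, add, commit, push.
git pull -q
echo "Draft analysis plan" >> README.md
git add README.md
git commit -q -m "Outline analysis plan"
git push -q
```

With a real remote, the setup lines collapse to a single `git clone` of the hosted repository, and only the last five commands are repeated day to day.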

– Jeff Oliver, Data Science Specialist, Tucson, Arizona

Have a favorite tool of your own? Please tell us about it!

