22 Months in the Making: New Genomics Curriculum Release

What goes into a major lesson update?

All of The Carpentries lessons focus on teaching core skills, rather than particular tools. But, as Greg Wilson says, “We have to use something” to teach those skills. Because we do use particular tools in our teaching, we frequently need to balance the need for lesson stability against the need to stay up to date with new technology.

In November 2017, Data Carpentry released a curriculum for working with genomic sequencing data. Work on this curriculum had started all the way back in 2014, and by the end of 2017 it had reached a stable state where the lessons were ready to be taught by people outside of the original development group. This workshop teaches genomics researchers how to manage their data, access data from popular sequencing databases, automate their analysis pipelines by writing custom Bash scripts, and compute in the cloud. It’s targeted at complete novices with no previous programming experience, and is designed to help research teams level up their computational skills and get more done in less time.

Genomics is a fast-moving field, and in August 2017 (even before the first official release!), community members teaching this workshop started to identify places where the lessons were teaching outdated tools that learners couldn’t easily apply to their own workflows. Workshop instructors started to advocate for updating both the data set and the software used, to modernize the workshop and keep it relevant to researchers working in the genomics field. These individual proposals garnered a lot of discussion, demonstrating the need the community felt for updating the lessons.

In July 2018, a group of attendees at CarpentryConnect Davis 2018 brought together these various conversations and developed a formal proposal. In September 2018, the Data Carpentry Genomics Curriculum Advisors unanimously approved the proposed changes, which included major intellectual contributions from Adam Orr, Azalee Bostroem, Daniel Standage, Fotis Psomopoulos, Jeffrey Miller, Mike Lee, Ming Tang, Rayna M Harris, Reed A. Cartwright, Ryan Peek, Sateesh Peri, Shannon EK Joslyn, Taylor Reiter, Tessa Pierce, Thomas Sandmann, Tracy Teal, and Tristan De Buysscher.

Over the next five months, the nitty-gritty work of implementing these changes was led by Taylor Reiter, who organized local hackathons and coordinated updates with Maintainers for all five of the genomics lesson repositories. In February 2019, Taylor and the lesson Maintainers merged all of these updates, with Taylor flying across the USA to teach the new lessons the very next day! Since February, the new curriculum has been taught at least nine times, at pilot workshops in Belgium, the Netherlands, Switzerland, and the USA. Instructors and helpers at these workshops, along with other contributors, have provided masses of feedback, leading to an official release of freshly polished versions of these lessons in June 2019.

All told, this lesson rewrite took 22 months, from conception to release, and required the dedicated work of hundreds of Instructors, helpers, learners, Maintainers, Curriculum Advisors, and other contributors. The community is incredibly excited about these developments, with over 150 Instructors already expressing enthusiasm for teaching this new curriculum. So on behalf of those Instructors, future workshop hosts, and learners, I’d like to put out a huge shout of gratitude to everyone who made these ideas a reality!

What’s Next?

We hope you’re as excited about these new lessons as we are! If you’re interested in teaching, requesting a workshop, or learning more about what is included:

If you’d like to help spread the word about these lessons,

  • Post our promotional flyer on your department message board, circulate it on your department mailing list, or send it out to other genomics or bioinformatics mailing lists you’re a member of
  • Tweet about the workshop using our suggested message (or your own statement of enthusiasm!):
    • Want your team to be more efficient at working with seq data? @datacarpentry provides training to get your team from messy spreadsheets to scripting and cloud computing. 91% recommend our workshops. Learn more or book a wkshp today: https://t.co/pZFQuYtejO
  • Retweet messages about this workshop posted by @thecarpentries and @datacarpentry

Our lessons are always a work in progress. Although any major changes to these lessons are at least a year away (per the Curriculum Advisory Committee), there are plenty of opportunities to improve in the meantime! If you spot any errors, or places in the lessons where things could be clearer, please submit an issue to let the Maintainers know. We’re particularly eager to have people review and make suggestions on the instructions for launching your own AWS instances, the FAQ, and any of the Instructor Notes.

If you have any questions about how to get involved or about the Genomics curriculum, email us at team@carpentries.org.

With Many Thanks To

Curriculum Advisors

Maintainers, past and present

  • Ahmed Moustafa
  • Amanda Charbonneau
  • Anita Schürch
  • Bastian Greshake
  • Bob Freeman
  • Darya Vanichkina
  • Erin Becker
  • Fotis Psomopoulos
  • Jason Williams
  • Josh Herr
  • Kevin Buckley
  • Krzysztof Poterlowicz
  • Lex Nederbragt (past Maintainer)
  • Malvika Sharan (past Maintainer)
  • Mateusz Kuzak
  • Naupaka Zimmerman
  • Peter Hoyt
  • Rayna Harris
  • Roselyn Lemus
  • Shichen Wang
  • Sue McClatchy (past Maintainer)
  • Yujuan Gui

Pilot Instructors

Many other contributors

Many people contributed to the lessons and are listed as authors in the June release on Zenodo. If you should be listed as an author, and we missed you, please let us know by emailing team@carpentries.org so we can add you!
