
Call for Contributions: Moving Ahead with Genomics Data Carpentry

“Get involved as we prepare for our first Issue Bonanza and Bug BBQ on the Data Carpentry Genomics lessons”

This post originally appeared on the Data Carpentry website

Way back in March of 2015, a group of 27 Carpentry community members gathered at Cold Spring Harbor for the first Data Carpentry Hackathon to work on a set of lessons and assessments for Genomics Data Carpentry. From the beginning, we wanted these lessons to be driven by, and useful to, the large community of biologists who are (or will be) working with large genomics datasets; perhaps as many as 90% of biologists.

The creation of these lessons involved several new approaches for Data Carpentry lesson development:

  • Lessons were conceived at a Hackathon that brought together community members with a wide range of genomics expertise - from students just getting started with genomics to researchers who have spent most of their careers in the field
  • Workshops would be taught on the cloud, acknowledging the reality that genomics research involves the use of cloud and HPC
  • Assessment should be ‘baked’ into the process; we want to think about learning objectives and how they can be assessed before we start getting lost in the details of vim vs. vi.

These innovations supported the essentials of Data Carpentry teaching. Lessons focus on principles rather than tools, emphasize tidy data and reproducible workflows, and are unified along a “story arc” that captures common tasks of genomics analyses.

Since the hackathon, these lessons have been taught at least 15 times in 6 countries, to a positive response:


Quotes from Genomics workshop participants

  • “The instructors were knowledgeable and helpful, and willing to approach things from a very basic perspective to get everyone up to speed. They covered a lot of tools with practical applications useful to anyone doing bioinformatics.”

  • “Great background material for beginners looking to get past the initial hurdle of using computational tools.”

  • “Hands on coding along with instruction. Appropriate exercises. Compelling AND approachable presenters.”

  • “Coding along with the instructors and using a sample data set was great!”

  • “The hands on nature of the workshop allows you to get a good grasp of the command line and you get practice troubleshooting any mistakes that you may encounter.”


We are now at the point where there is sufficient experience to move these lessons to a more formal release that will be more easily used, updated, and maintained by the community. As with all Software and Data Carpentry lessons, these lessons are by and for the community!

What are some short- and long-term objectives?

In the short term, we are organizing the existing lesson repos into new layouts and formatting. Shortly after the call for contributions, we will follow the successful model of having an “Issue Bonanza” to identify problems with existing materials, followed by a “Bug BBQ” where we will fix those problems. In the longer term, we would like to have an active Genomics Curriculum Committee - a self-organised group that helps update and maintain the lessons to keep pace with the tools and the science.

At the moment, a de facto committee of staff and instructors (Erin Becker, Bob Freeman, Kari Jordan, Mateusz Kuzak, Sue McClatchy, Maneesha Sane, Tracy Teal, Jason Williams) has identified some possible goals to work towards:

  • Offering two 2-day workshop formats (one focused on R, one focused on building genomics pipelines)
  • Suggesting a comprehensive lesson sequence for self-organised workshops that will be taught over more than 2 days
  • Improving how we manage our image and making it easier to reproduce on clouds outside of AWS
  • Offering a full set of lessons adapted to be taught via HPC
  • Improving assessment

How you can contribute

We are asking anyone interested in helping now (or in the future) to fill out this brief form so that we can organize the effort: Contribution Form

While experience in genomics and Data Carpentry is a plus, there are many ways to contribute even if you don’t have this background. Please circulate this link and post to others who might be interested. We will be following up near the end of May 2017 to organize everyone and provide more info.

Thanks to everyone who is working to move these lessons to the next stage!
