NGS Summer 2016: Analyzing Next-Generation Sequencing Data

This post originally appeared on the Software Carpentry website.

A two-week residential course on next-generation sequencing is being offered at the Kellogg Biological Station at Michigan State University from August 8 to 19, 2016. The course’s directors are Prof. Matt MacManes (U. New Hampshire) and Prof. Meg Staton (U. Tennessee, Knoxville), and instructors will include Prof. Ian Dworkin (McMaster U.), Prof. Torsten Seemann (U. Melbourne), Shaun Jackman (PhD candidate, UBC), and others. For more information, or to register, please see

Note: if you are running a course that might be of interest to our community, please let us know.

This intensive two-week summer course will introduce attendees with a strong biology background to the practice of analyzing short-read sequencing data from Illumina and other next-gen platforms (e.g., Nanopore, PacBio). The first week will introduce students to computational thinking and large-scale data analysis on UNIX platforms. The second week will focus on genome and transcriptome assembly, transcript quantitation, mapping, and other topics.

No prior programming experience is required, although familiarity with some programming concepts is helpful, and bravery in the face of the unknown is necessary. Two or more years of graduate school in a biological science is strongly suggested. Faculty, postdocs, and research staff are more than welcome!

Students will gain practical experience in:

  • Python and bash shell scripting
  • cloud computing/Amazon EC2
  • basic software installation on UNIX
  • installing and running Trinity, BWA, Salmon, SPAdes, ABySS, Prokka, and other bioinformatics tools
  • querying mappings and evaluating assemblies

Materials from previous courses are available under a Creative Commons/full use+reuse license.
