Formatting Revisited

This post originally appeared on the Software Carpentry website.

David Seifreid has been working for the past week to combine deck.js, a JavaScript-plus-CSS slideshow package, with the AudioJS audio player, so that we can re-do slides as pure HTML5 (instead of using PNGs exported from PowerPoint). At the same time, I'm trying to turn the course notes into long-form prose (what used to be called a "book") for people who prefer reading at leisure. How should all this content be managed? My previous post on 21st Century teaching formats described what I'd eventually like, but the tool I want doesn't exist yet, so what can be done now?

We will have:

metadata, such as keywords and topic guides;
slides containing
- vector diagrams,
- raster images,
- point-form text, and
- code samples
audio narration synced with the slides;
transcripts of the narration; and
prose (the "book" stuff), which may include the same code samples and figures.

I know from experience that the transcripts of the audio will be a starting point for the book-form material, but the latter will be longer. We'll therefore have four parallel streams of data: slides, audio, narration (as text), and the book. That suggests something like this (using the topic/concept distinction I discussed a couple of weeks ago):

<section class="topic">
  <section class="metadata">
    keywords, notes to instructors, etc.
  </section>

  <audio src="..."></audio>

  <section class="concept">

    <section class="slide" popcorn-slideshow="start time in seconds">
      Slide (or several slides if we're using progressive highlighting on a single logical slide).
    </section>

    <section class="transcript">
      Text transcript of the slide (or group of closely-related slides).
    </section>

    <section class="book">
      Long-form prose discussion of the same concept.
    </section>

  </section>

  <section class="concept">
    ...as above...
  </section>
</section>

Diagrams and images will be stored in external files and href'd in. I've played with putting the SVG directly in the document, but in practice people are going to use different tools to edit the two anyway. I'd like to use inclusions for code fragments, so that they don't have to be duplicated in the slide and book sections, but there's no standard way to do text inclusion in HTML (which is odd when you think about it, given that other media are usually included by reference instead of by value).

The advantages of this format that I see are:

Anyone can edit it without special tools.
It's mergeable: provided people stick to a few rules about indentation and the like, it'll be a simple text merge (which is a lot easier than merging PowerPoint slides or the like).
We can re-skin it using CSS and a bit of JavaScript. (For example, the default web view won't show the transcript or book form, just the slides and audio.)
It's accessible to people with visual impairments (since related content is side-by-side and complete).
We can compile it to produce web-only or print-only versions using XSLT or the like if we want to.
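
To make the "compile it" point concrete, here is a minimal sketch in Python rather than XSLT. It assumes the file is kept well-formed as XML (which HTML5 allows but doesn't require), and the function name `compile_view` and the sample content are mine, purely for illustration. It just drops the sections whose class we don't want in a given view:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample topic, following the section/class layout sketched above.
TOPIC = """\
<section class="topic">
  <section class="concept">
    <section class="slide">Slide text</section>
    <section class="transcript">Narration text</section>
    <section class="book">Long-form prose</section>
  </section>
</section>
"""

def compile_view(xml_text, drop):
    """Return the document with every <section> whose class is in `drop` removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the tree first so that removing children doesn't disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "section" and child.get("class") in drop:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

# A web-only view keeps the slides and drops the transcript and book streams.
web_only = compile_view(TOPIC, {"transcript", "book"})
```

A print-only view would presumably be the same call with a different `drop` set, e.g. `{"slide"}`.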

Things I don't like:

I really would like to store code snippets in external files and href them as if they were diagrams or images. We can do that with a simple script, but then what you're editing and what you're looking at in your previewer (browser) will be separated by a compilation step, which in my experience always results in headaches.
Different authors' HTML editing tools will indent things differently, so we'll need to provide some sort of normalizer for them to run before doing an update or commit. It's not a big deal, but again, experience teaches that it will lead to a constant background annoyance level ("Whoops, sorry, I forgot to clean up before I committed that change").
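
For what it's worth, the "simple script" for code inclusion could be as small as the sketch below. The marker it looks for, an empty `<pre class="code" src="...">` element, is a convention I'm making up for this example (HTML itself defines no such thing), and `expand_includes` is likewise a hypothetical name:

```python
import html
import re
from pathlib import Path

# Hypothetical convention: an empty <pre class="code" src="path"></pre>
# marks the place where the contents of an external code file should go.
INCLUDE = re.compile(r'<pre class="code" src="([^"]+)">\s*</pre>')

def expand_includes(text, base_dir):
    """Fill each empty include marker with the HTML-escaped contents of its file."""
    base = Path(base_dir)

    def fill(match):
        src = match.group(1)
        code = (base / src).read_text(encoding="utf-8")
        # Escape <, > and & so the code displays as text instead of being parsed.
        return f'<pre class="code" src="{src}">{html.escape(code)}</pre>'

    return INCLUDE.sub(fill, text)
```

Running `expand_includes(page_text, "path/to/snippets")` would produce the file you actually preview, which is exactly the compilation step I'm worried about.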

We could use a wiki-like syntax for notes, and rely on something like Sphinx to convert that to what we need. This is the route the authors of these SciPy lectures have taken, and while it's intriguing, I don't see how to support the parallel streams we want without some serious hackage. It would also tie any processing tools we build to an idiosyncratic format (reStructuredText); HTML5 might be more typing, but it can also be crunched by almost any language people are likely to use straight out of the box.
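
As a rough illustration of the "out of the box" claim, here is a sketch that pulls the slide, transcript, and book streams apart using nothing but Python's standard `html.parser` module. The class and function names are hypothetical, and it assumes the section/class layout sketched earlier:

```python
from html.parser import HTMLParser

STREAMS = {"slide", "transcript", "book"}

class StreamCollector(HTMLParser):
    """Collect the text of each stream, keyed by the enclosing section's class."""

    def __init__(self):
        super().__init__()
        self.section_stack = []  # class attribute of every open <section>
        self.streams = {name: [] for name in STREAMS}

    def handle_starttag(self, tag, attrs):
        if tag == "section":
            self.section_stack.append(dict(attrs).get("class"))

    def handle_endtag(self, tag):
        if tag == "section" and self.section_stack:
            self.section_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Only keep text whose innermost enclosing section is one of the streams.
        if text and self.section_stack and self.section_stack[-1] in STREAMS:
            self.streams[self.section_stack[-1]].append(text)

def collect_streams(html_text):
    """Return a dict mapping each stream name to its text chunks, in document order."""
    parser = StreamCollector()
    parser.feed(html_text)
    parser.close()
    return parser.streams
```

Calling `collect_streams` on a topic file gives back a dictionary with one list of text chunks per stream, which is roughly the starting point any downstream tool would need.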

Thoughts?